Multivariate Analysis Coupled with M-SVM Classification for Lard Adulteration Detection in Meat Mixtures of Beef, Lamb, and Chicken Using FTIR Spectroscopy

Adulteration of meat products is a delicate issue for people around the globe. The mixing of lard in meat causes a significant problem for end users who are sensitive to halal meat consumption. Due to the highly similar lipid profiles of meat species, the identification of adulteration becomes more difficult. Therefore, a comprehensive spectral detailing of meat species is required, which can boost the adulteration detection process. The experiment was conducted by distributing samples labeled as “Pure (80 samples)” and “Adulterated (90 samples)”. Lard was mixed with the ratio of 10–50% v/v with beef, lamb, and chicken samples to obtain adulterated samples. Functional groups were discovered for pure pork, and two regions of difference (RoD) at wavenumbers 1700–1800 cm−1 and 2800–3000 cm−1 were identified using absorbance values from the FTIR spectrum for all samples. The principal component analysis (PCA) described the studied adulteration using three principal components with an explained variance of 97.31%. The multiclass support vector machine (M-SVM) was trained to identify the sample class values as pure and adulterated clusters. The acquired overall classification accuracy for a cluster of pure samples was 81.25%, whereas when the adulteration ratio was above 10%, 71.21% overall accuracy was achieved for a group of adulterated samples. Beef and lamb samples for both adulterated and pure classes had the highest classification accuracy value of 85%, whereas chicken had the lowest value of 78% for each category. This paper introduces a comprehensive spectrum analysis for pure and adulterated samples of beef, chicken, lamb, and lard. Moreover, we present a rapid M-SVM model for an accurate classification of lard adulteration in different samples despite its low-level presence.


Introduction
The verification of authenticity and the detection of adulterants are critical aspects of food control, particularly for high-value items. Laboratory data, together with chemical, physical, and visual information on foodstuffs, are used as measures of food quality and authenticity. Food authenticity is a major concern in the worldwide food industry; with the abundance of packaged food with a lengthy supply chain on the market, food

Meat Sample Collection
All meat samples were obtained from the local market at Seri Iskander in Malaysia. The meat was then washed with purified water, cut into small pieces (1 cm × 1 cm), and held at −10 °C. The samples were divided into two classes, pure and adulterated. There were 80 pure and 90 adulterated samples produced for the spectral analysis. The sample preparation was designed to be straightforward, with no extra chemical substances used. Beef, lamb, and chicken loin cuts were used, and all pork was lean meat taken from chops.

Extraction Procedure and Sample Distribution
Lard and other animal body fats (chicken fat, beef fat, and mutton fat) were extracted according to the method described in [34], with minor modifications. All samples were gradually heated from 50 °C to 150 °C for 45 min until the fat was rendered from the samples on the petri dish. The rendered fat was then filtered because it contained minute solid particles. The samples were subsequently centrifuged at 3000 rpm for 20 min and filtered through Whatman filter paper. The pure fats produced by the extraction process were then used to make the adulterated samples. All chemicals used in this experiment were of analytical grade. Pure and adulterated fats were then analyzed using FTIR spectroscopy on a Frontier FT-IR instrument (PerkinElmer). The optical system, fitted with a KBr beam splitter, collects data over a range of 8300–350 cm−1 at a best resolution of 0.4 cm−1. The resulting spectrum contained 2500 continuous values per sample, at intervals of 0.8 cm−1. To ensure that there was no major fluctuation between scanned spectra, each spectrum was recorded at the same temperature. This procedure was required to remove any uncontrolled ambient influences on the instrument and the sample.

Spectral Data Pre-Processing
Savitzky-Golay smoothing and standard normal variate (SNV) transformation were used as spectral pre-processing approaches in this investigation. The spectra were smoothed with a Savitzky-Golay filter using a second-order polynomial and a 5-point window to suppress the random disturbances caused by the system's internal components, which increases the precision of the data without distorting the signal tendency. SNV was used to adjust for scatter effects and reduce slope variation.
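The pre-processing described above could be sketched as follows. This is a minimal illustration, not the authors' code: it assumes SciPy's `savgol_filter` for the smoothing step, applies SNV after smoothing (the order is an assumption), and uses synthetic spectra in place of the measured data. The function name `preprocess_spectra` is hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter


def preprocess_spectra(spectra, window_length=5, polyorder=2):
    """Savitzky-Golay smoothing followed by SNV, one spectrum per row."""
    spectra = np.asarray(spectra, dtype=float)
    # 5-point window, second-order polynomial, as stated in the text.
    smoothed = savgol_filter(spectra, window_length=window_length,
                             polyorder=polyorder, axis=1)
    # SNV: center and scale each spectrum by its own mean and standard deviation.
    mean = smoothed.mean(axis=1, keepdims=True)
    std = smoothed.std(axis=1, ddof=1, keepdims=True)
    return (smoothed - mean) / std


# Synthetic stand-in data (assumption): 4 noisy spectra with one band near 2921 cm^-1.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(4000, 650, 500)
band = np.exp(-((wavenumbers - 2921) / 30) ** 2)
demo = band + 0.01 * rng.normal(size=(4, 500))

processed = preprocess_spectra(demo)
print(processed.shape)
```

After SNV, every row has zero mean and unit standard deviation, so multiplicative scatter differences between samples no longer dominate the comparison.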

Preparing Mixture Samples
Lard was mixed with the body fats of lamb, beef, and chicken to obtain a series of standard (training) sets comprising 80 pure samples and 90 adulterated samples, the latter containing 10–50% v/v of lard in lamb, beef, and chicken, as shown in Table 2. The method follows Rohman et al. [23]. We prepared six samples for each combination of lard with lamb, chicken, or beef at lard proportions of 10, 20, 30, 40, and 50%. The labels give the percentage of the host species: B-90%, L-90%, and C-90% indicate 10% lard with 90% of beef, lamb, and chicken, respectively, whereas B-50%, L-50%, and C-50% represent a 50:50 ratio of lard with the respective species. The detailed distribution of samples is presented in Table 3.
Table 3. Composition of adulterated samples with the ratio of lard mixed with samples of beef, lamb, and chicken, represented by their initials (Lamb: L-90% to L-50%, Beef: B-90% to B-50%, Chicken: C-90% to C-50%).

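Mixture Samples Label      Pork (% v/v)    Number of Samples
B-90%, L-90%, C-90%        10              6 per label
B-80%, L-80%, C-80%        20              6 per label
B-70%, L-70%, C-70%        30              6 per label
B-60%, L-60%, C-60%        40              6 per label
B-50%, L-50%, C-50%        50              6 per label
Total Mixture Samples                      90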

Results and Discussion
After sample preparation and data pre-processing, the spectra obtained for the pure and adulterated samples were analyzed separately. The workflow for investigating lard adulteration followed a three-stage process. In the first stage, the functional groups in uncontaminated lard samples were identified. In the second stage, the pure spectra of beef, lamb, chicken, and lard were overlapped to identify the regions of difference (RoD) with the highest significance; the adulterated samples of beef, lamb, and chicken were also profiled by their percentage difference. In the third and final stage, multivariate analysis was combined with M-SVM classification for the pure and adulterated samples separately. Samples were divided into two classes, 'Haram (lard)' and 'Halal (chicken, lamb, and beef)', for M-SVM classification.

FTIR Spectra Analysis of Pure Samples
Among the four different meat fats, the pure lard used in this study was evaluated and analyzed separately using FTIR spectroscopy. Figure 1 shows a peak at approximately 2921 cm−1, which was due to the stretching vibration of C-H (sp3) in =C-H cis. The -CH2 functional group provided a peak at 2853 cm−1 as a result of asymmetrical and symmetrical vibration. The triglyceride ester carbonyl (C=O) group produced a peak at 1750 cm−1.
In the fingerprint region, stretching vibrations of the C-O group in esters were detected at 1155 cm−1, while bending vibrations of the CH2 and CH3 aliphatic groups were detected at 1467 cm−1, as shown in Figure 1. Table 4 shows the wavenumbers and the associated functional-group vibrations for the pure lard sample.
Figure 2 shows the overlapped FTIR spectra of the pure samples, used to identify the wavenumbers and associated peaks that form the region of difference (RoD), along with the fingerprint region. To make the analysis convenient, the spectrum can be divided into three regions plus the fingerprint region: 3000–2500 cm−1, 2500–2000 cm−1, 2000–1500 cm−1, and the fingerprint region at 1500–500 cm−1. Two regions where the change in absorbance is most prominent when the pure samples of all species are overlapped are highlighted with dotted lines (a and b) in Figure 2; they span 1700–1800 cm−1 for RoD(a) and 2800–3000 cm−1 for RoD(b), as shown in Figure 3. The FTIR spectra of all the lipids obtained from the different species were combined and overlapped.
As the proportion of lard increases in both beef and chicken, the absorbance values converge toward those of lard. Lamb samples contrast sharply with this behavior and show negligible change when lard is mixed in. This is clearly visible in the spectra of all the adulterated samples shown in Figure 4. The absorbance values in RoD(b), where lard adulteration can potentially be detected, were analyzed in detail and are shown in Table 5. Beef samples, by contrast, are highly susceptible: lard is detectable through the significant change in absorbance in the 2800–3000 cm−1 region, specifically at RoD(b)-a and RoD(b)-b, which cover 2840–2860 and 2900–2940 cm−1, respectively. Table 5 lists the absorbance values at the peaks of RoD(b) in Figure 2; the percentage difference is calculated with respect to lard for the peak absorbance in the highly significant regions.
Table 5. Absorbance values and percentage difference with respect to lard for adulterated samples of beef, lamb, and chicken in the region of RoD(b) at the highly significant region of 2800–3000 cm−1.

The highest proximity of absorbance values to pure lard can be seen in the samples of B-50%, C-90%, C-80%, and C-50%, for both regions RoD(b)-a and RoD(b)-b. At the same time, adulterated beef shows a pattern of variation according to the adulteration percentage of lard. Beef samples with 10% adulteration (B-90%) have an approximate percentage difference of 7-14%, while beef with 50% adulteration (B-50%) shows approximately 3-8% change for both regions. All samples containing adulterated chicken from C-50% to C-90% show the lowest percentage difference as compared to lamb and beef. This reveals the highest similarity to be between chicken and lard, which could present some difficulty in detecting the adulteration of lard in chicken irrespective of the percentage mixing. Moreover, adulterated lamb samples depict minor variation in absorbance values throughout the mixing samples (L-50% to L-90%) and have the highest percentage difference as compared to pure lard.
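The percentage differences discussed above can be illustrated with a short sketch. The exact formula is not given in the text, so the definition below (absolute difference divided by the lard value) is an assumption, as are the function names `peak_absorbance` and `percent_difference`. The spectra are synthetic and the numbers are not taken from Table 5.

```python
import numpy as np


def peak_absorbance(wavenumbers, absorbance, low, high):
    """Return the maximum absorbance inside the band [low, high] in cm^-1."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    absorbance = np.asarray(absorbance, dtype=float)
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    if not mask.any():
        raise ValueError("no spectral points inside the requested band")
    return float(absorbance[mask].max())


def percent_difference(sample_value, lard_value):
    """Assumed definition: |sample - lard| / |lard| * 100."""
    return abs(sample_value - lard_value) / abs(lard_value) * 100.0


# Hypothetical spectra on a 0.8 cm^-1 grid covering RoD(b).
wn = np.arange(2800.0, 3000.0, 0.8)
shape = np.exp(-((wn - 2921.0) / 12.0) ** 2)
lard = 0.50 * shape   # illustrative pure-lard band
mix = 0.45 * shape    # illustrative adulterated sample

# RoD(b)-b spans 2900-2940 cm^-1 according to the text.
lard_peak = peak_absorbance(wn, lard, 2900, 2940)
mix_peak = peak_absorbance(wn, mix, 2900, 2940)
diff = percent_difference(mix_peak, lard_peak)
print(f"percentage difference with respect to lard: {diff:.2f}%")
```

Under this definition, a smaller value means the sample's peak absorbance lies closer to that of pure lard, which is how the B-50% and chicken results above are read.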

Results of Principal Component Analysis
Pure lard, along with the other samples of beef, chicken, and lamb, was analyzed using the chemometric method of PCA, which reduces the dimension of the spectral signal. The wavenumber regions for PCA were also optimized. To confirm the separation based on adulterant type, the data were subjected to PCA, which is based on the eigenvectors of the covariance matrix. A further explanation of PCA is given in Appendix A.1.
Plotting the scores of the first two principal components (Figure 5), which together represent 99.36% of the data variance, reveals a distinct split depending on the level of adulteration. Only a small amount of overlap exists among the chicken samples adulterated with pork. The wavenumbers were selected for their ability to provide a useful classification between samples, as seen in Figure 5. The PCA plot showed clusters of samples based on their similarity along the first principal component (PC1) and the second principal component (PC2), which provided a good separation between the lamb, beef, and pork groups but could not separate pork and chicken. PC1 and PC2 accounted for 97.31% and 2.05% of the variability, respectively; PC1 captured most of the variation in the data, as shown in Table 6.

The FTIR spectra of the pure pork sample were compared with those of adulterated beef, chicken, and lamb. Figure 6 presents the PCA projection in one, two, and three dimensions for better analysis. Figure 6a shows the distribution of the pure beef, lamb, chicken, and pork samples along the first principal component (1D); the chicken and pork samples overlap, reflecting closely matched absorbance values at similar wavenumbers. Figure 6b depicts the adulterated samples of all species on PC1 and PC2 (2D), and Figure 6c combines all three principal components (3D) for the adulterated samples. The regions in these figures are separated by adulteration level, from slightly mixed (10%) to highly adulterated (50%).
In the first projection, the plotted points representing the samples of chicken, beef, and lamb are scattered and lie far from the pork group. The closer the points of chicken, beef, and lamb are to the pork samples, the greater the quantity of lard in those samples.
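The PCA step described in this section could be sketched with scikit-learn as shown below. This is an illustrative example on synthetic data, not the study's spectra; the variable names and the two-factor data model are assumptions chosen so that the first component dominates, loosely mirroring the reported dominance of PC1.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for pre-processed spectra (assumption): 40 samples, 200 points,
# generated from two latent factors of unequal strength plus a little noise.
rng = np.random.default_rng(42)
n_samples, n_points = 40, 200
latent = rng.normal(size=(n_samples, 2)) * np.array([5.0, 1.0])
loadings = rng.normal(size=(2, n_points))
spectra = latent @ loadings + 0.05 * rng.normal(size=(n_samples, n_points))

# Keep three principal components, as in the 1D/2D/3D projections of Figure 6.
pca = PCA(n_components=3)
scores = pca.fit_transform(spectra)        # sample coordinates on PC1..PC3
explained = pca.explained_variance_ratio_  # fraction of variance per component
cumulative = np.cumsum(explained)

for i, (e, c) in enumerate(zip(explained, cumulative), start=1):
    print(f"PC{i}: {e:.2%} of variance (cumulative {c:.2%})")
```

The columns of `scores` are what a score plot such as Figure 5 displays, and `explained_variance_ratio_` is the quantity reported as percentage variability for each component.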

Multiclass Support Vector Machine Classification
The data obtained from the previous processes were divided into training data (70%) and testing data (30%) and subsequently evaluated with the classification model. The data acquired from the FTIR spectrometer were analyzed using the scikit-learn machine learning library in Python. The radial basis function (RBF) was used as the SVM kernel, with parameters selected by grid search. As an extra validation step, we used confusion matrices for both multiclass data sets, as shown in Tables 7 and 8. A confusion matrix compares the true labels against the predicted labels. We divided the problem into two parts: one model identified the pure samples, and the other predicted the adulterated samples. The learning rate was 0.0001, and the regularization parameter λ was set to 1/epochs. Details of the SVM are given in Appendix A.2.
Table 7 lists the user, producer, and overall accuracy for the pure-sample data set. With optimal parameters, pure beef and lamb produced the highest accuracy (85%) among all the samples, pure chicken had the lowest accuracy (75%), and pure pork fell in between at 80%. Figure 7 shows the confusion matrix obtained using 10-fold cross-validation for the pure samples, in which the rows a, b, and c represent the true labels and the columns a, b, and c represent the number of samples the model predicted for each respective class.
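A workflow of the kind described above (70/30 split, RBF-kernel SVM tuned by grid search, confusion matrix) might look like the following scikit-learn sketch. The data are synthetic, and the parameter grid, the 5-fold inner cross-validation, and the scaling step are assumptions rather than the settings used in the study. The learning rate and λ mentioned in the text are not parameters of scikit-learn's `SVC` and are therefore not represented here.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the pure data set (assumption): 20 samples per class.
rng = np.random.default_rng(0)
classes = ["beef", "lamb", "chicken", "pork"]
n_per_class, n_features = 20, 50
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(n_per_class, n_features))
               for i in range(len(classes))])
y = np.repeat(classes, n_per_class)

# 70% training / 30% testing, stratified so each class appears in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

# RBF-kernel SVM; C and gamma are chosen by grid search (illustrative grid).
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": [0.1, 1, 10, 100],
              "svc__gamma": ["scale", 0.01, 0.001]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Rows of the confusion matrix are true labels, columns are predicted labels.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred, labels=classes)
overall = accuracy_score(y_test, y_pred)
print(search.best_params_)
print(cm)
print(f"overall accuracy: {overall:.2%}")
```

scikit-learn's `SVC` handles more than two classes internally with a one-vs-one scheme, which is one common way to realize a multiclass SVM.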

The predicted labels for pure samples shown in Figure 7 misclassified three samples of pure chicken as pure pork, while two samples of pure pork were falsely labeled as chicken. Moreover, beef and lamb each had three label misclassifications, one for each species of meat.
Table 8 shows the confusion matrix for the multiclass SVM on the adulterated data samples. The adulterated data set contained all the samples that were adulterated with different proportions of lard; for example, the AdulteratedBeef class included samples with v/v ratios from B-50% to B-90%. The producer accuracy was highest for AdulteratedLamb at 76.6%, whereas AdulteratedBeef had the second-highest value of 73.3%. The spectrum of lamb showed no change in absorbance when it was adulterated, irrespective of the adulteration ratio, which was corroborated by the SVM classifier obtaining the maximum number of correctly classified labels for this class, as shown in Figure 8.

AdulteratedChicken samples, with 20 correctly classified samples, produced the lowest precision accuracy of 66% due to its high variation in absorbance values, as shown in Figure 8.
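As a rough consistency check on the reported figures, overall accuracy can be computed as the number of correctly classified samples divided by the total. The per-class counts below are assumptions: only the 20 correct AdulteratedChicken samples are stated in the text, and the others are back-calculated from the reported percentages under the assumption of 20 samples per pure class and 30 per adulterated class.

```python
def overall_accuracy(correct_per_class, total_per_class):
    """Overall accuracy = total correctly classified / total samples."""
    return sum(correct_per_class.values()) / sum(total_per_class.values())


# Pure set: 80 samples, assumed 20 per class; counts inferred from 85/85/75/80%.
pure_correct = {"beef": 17, "lamb": 17, "chicken": 15, "pork": 16}
pure_total = {name: 20 for name in pure_correct}

# Adulterated set: 90 samples, assumed 30 per class (6 samples x 5 lard levels).
# Lamb and beef counts are inferred from 76.6% and 73.3%; chicken (20) is stated.
adulterated_correct = {"lamb": 23, "beef": 22, "chicken": 20}
adulterated_total = {name: 30 for name in adulterated_correct}

pure_overall = overall_accuracy(pure_correct, pure_total)
adulterated_overall = overall_accuracy(adulterated_correct, adulterated_total)
print(f"pure: {pure_overall:.2%}, adulterated: {adulterated_overall:.2%}")
```

Under these assumed counts, the two ratios are 65/80 and 65/90, which agree with the overall accuracies of 81.25% and 72.2% quoted in the Conclusions.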

Conclusions
FTIR spectroscopy, coupled with multivariate analysis and M-SVM, appears to be an efficient and rapid technique for discriminating lard from other meat samples. In this paper, we demonstrated the identification and discrimination of lard from beef, chicken, and lamb fats in meat mixtures. FTIR spectral analysis in combination with principal component analysis (PCA) and M-SVM has shown that pure lard fat has unique peaks at 1155 cm−1, 1467 cm−1, 1750 cm−1, and 2921 cm−1 that can distinguish pork from beef, chicken, and lamb meat. The absorbance values indicate a direct correlation between lard and other species. The PCA results show that adulteration in chicken meat is positively correlated with pork meat, while lamb is negatively correlated with lard. The SVM model produced an overall prediction accuracy of 81.25% for pure samples and 72.2% for adulterated samples. The overall accuracy was computed using the sensitivity and precision values. The model classified the pure samples more accurately than the adulterated samples, owing to the small number of samples and the minimal difference in the absorbance values of the spectra. Thus, this study has the potential to be established as a rapid method for halal authentication and could revolutionize in-line quality control in the meat industry. For future work, the number of FTIR profiles for pure and adulterated samples can be increased, and deep learning may be applied to detect adulteration levels below 10%.

Funding: This study is funded by Centre of Graduate Studies, UTP in collaboration with ITI institute for smart mobility, University Technology Petronas, Perak, Malaysia.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the corresponding author.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A

Appendix A.1. Principal Component Analysis
Principal Component Analysis (PCA) is a statistical technique that is particularly useful for reducing observations that have many dimensions. It transforms the dimensions of a data set into a new, smaller set of uncorrelated dimensions called principal components (PCs). An array of values q_ij can be normalized using Equation (A1), in which X_ij is the normalized array element and q̄_j is the mean value of variable j. The normalized array is then used to construct a correlation matrix, which describes how the variables in the data set are correlated; its coefficients are obtained with Equation (A2). Finally, only the principal components that explain the greatest amount of variance in the original data are retained, and the scores are computed using Equation (A3), where S is the score matrix, V is the matrix of eigenvectors, and Q is the original data array. Each column of S then represents the projection of the initial data Q onto one principal component.
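The equations referred to in this appendix are not reproduced in the text, so the following is a plausible reconstruction based on the standard PCA formulation rather than the authors' exact notation. The symbols n (number of samples), s_j (standard deviation of variable j), R (correlation matrix, written R here to avoid reusing X), and Λ (diagonal matrix of eigenvalues) are assumptions, as is the division by s_j in the normalization step.

```latex
% Normalization of each variable (column) j of the data array q_{ij}
X_{ij} = \frac{q_{ij} - \bar{q}_j}{s_j}, \qquad
\bar{q}_j = \frac{1}{n}\sum_{i=1}^{n} q_{ij}, \qquad
s_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(q_{ij} - \bar{q}_j\right)^{2}}

% Correlation matrix built from the normalized array
R = \frac{1}{n-1}\, X^{\top} X

% Eigen-decomposition and scores; V holds the retained eigenvectors as columns
R\,V = V\,\Lambda, \qquad S = Q\,V
```

In the last expression Q is understood in its normalized form, so that each column of S is the projection of the data onto one eigenvector of R.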

Appendix A.2. Support Vector Machine Classification
Many machine learning techniques were created and statistically verified for linearly separable data; linear classifiers such as support vector machines (SVMs) and dimensionality-reduction methods such as conventional principal component analysis (PCA) are common examples. However, most real-world data require non-linear approaches to accomplish pattern analysis and discovery efficiently. The SVM approach has improved over time by incorporating the kernel trick: to detect a pattern in data that are not linearly separable, the kernel method effectively maps the input data to a higher-dimensional space. SVMs are an excellent classification approach when the training data have many variables relative to the number of observations. In SVM, every sample x that consists of n variables is treated as an n-dimensional vector.
Prediction performance can be assessed using three indicators: sensitivity (user accuracy), precision (producer accuracy), and overall accuracy. Precision is the proportion of correctly predicted positive labels to all positive labels produced by the model. Sensitivity is the ratio of correctly predicted positive labels to all samples that are actually positive. Accuracy is the proportion of correctly classified samples to the total number of samples. Equations (A4)-(A6) give the formulas for precision, accuracy, and sensitivity.
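The three indicators defined above correspond to the standard confusion-matrix formulas, written out below for reference. TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives for a given class treated one-vs-rest; this notation and the assumption that these expressions match the paper's Equations (A4)-(A6) are not confirmed by the text.

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

\mathrm{Sensitivity} = \frac{TP}{TP + FN}
```

For a multiclass problem, precision and sensitivity are evaluated per class from the columns and rows of the confusion matrix, while overall accuracy is the sum of the diagonal divided by the total number of samples.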