Raw Beef Patty Analysis Using Near-Infrared Hyperspectral Imaging: Identification of Four Patty Categories

South African legislation regulates the classification/labelling and compositional specifications of raw beef patties to combat processed meat fraud and to protect the consumer. A near-infrared hyperspectral imaging (NIR-HSI) system was investigated as an alternative authentication technique to the current destructive, time-consuming, labour-intensive and expensive methods. Eight hundred beef patties (ca. 100 g) were made and analysed to assess the potential of NIR-HSI to distinguish between the four patty categories (200 patties per category): premium 'ground patty'; regular 'burger patty'; 'value-burger/patty' and the 'econo-burger'/'budget' patty. Hyperspectral images were acquired with a HySpex SWIR-384 (short-wave infrared) imaging system using the Breeze® acquisition software, in the wavelength range of 952–2517 nm, after which the data were analysed using image analysis, multivariate techniques and machine learning algorithms. It was possible to distinguish between the four patty categories with accuracies ≥97%, indicating that NIR-HSI offers an accurate and reliable solution for the rapid identification and authentication of processed beef patties. Furthermore, this study has the potential to provide an alternative to the current authentication methods, thus contributing to the authenticity and fair trade of processed meat products locally and internationally.


Introduction
Adulteration of food has a long history and is still a common practice around the world. Food adulteration occurs to increase financial return, where a high-value food is contaminated by adding a low-value ingredient. Adulteration is not only a form of consumer fraud and economic misconduct, but it also poses a risk to people who are allergic to certain ingredients or meat species, have religious prohibitions, or have ethical aversions [1]. Furthermore, it seriously violates the rights of consumers.
Meat adulteration is a current issue with economic and safety implications because it is difficult to detect specific ingredients/adulterants and/or distinguish between species when evaluating meat visually [2]. In the past, meat adulteration scandals have included undeclared soy protein incorporation in Brazilian hamburgers [3], meat from undeclared animal species in Mexican hamburgers and sausages [4], undeclared animal species in various meat products in the USA [5] and Turkey [6], and the incorporation of undeclared horsemeat in processed beef products in the EU [7,8]. In South Africa, Cawthorn et al. [9] discovered species such as chicken, goat, water buffalo and donkey in beef sausages that were not disclosed on the product labelling. Following such reports, consumers became concerned about the traceability and origin of the food they eat.

Materials
The lean meat [beef (Bos taurus), pork (Sus scrofa domesticus), lamb (Ovis aries), ostrich (Struthio camelus)] and fat samples were obtained from Birdstreet Butchery (Stellenbosch, South Africa) in October 2020, vacuum sealed and stored at 4 °C until used. The meat consisted of deboned and defatted muscles, while the fat comprised subcutaneous fat trimmed from beef carcasses. The lean meat and beef fat were derived from numerous animal carcasses. The mechanically recovered meat (MRM) [poultry (Gallus gallus domesticus)] was obtained from Deli Spices (Cape Town, South Africa), vacuum sealed and frozen at −20 °C until needed. The Burger Ready spice packs, burger rusk, textured vegetable protein (TVP) and soya fibre were also supplied by Deli Spices.

Production
A total of 800 patties were produced at Deli Spices according to the different formulations and treatments shown in Tables 1-4. Prior to patty production, the beef, pork, lamb, ostrich, MRM and fat were separately minced with a meat mincer (Mainca PC-32, Equipamientos Cárnicos, Barcelona, Spain) using 4.5 mm plate openings. All the ingredients for each patty formulation and treatment were thoroughly mixed in separate mixing bowls to ensure the even distribution of meat, fat and other added ingredients in the patty batter. A Butcherquip hand-operated patty machine (Butcherquip, Roodepoort, South Africa) was used to press the patty batter into 100 g patties (21 mm thick) (Deli Spices), which were then packaged in Styrofoam patty containers, wrapped in cling film and labelled according to formulation and treatment. After the patties were produced, they were placed in plastic containers and transported to the Department of Animal Sciences, Stellenbosch University (SU) for storage at 4 °C (Formulations 1, 2 and 3) and −20 °C (Formulation 4) until analysed.

The patties for the ground burger category (Figure 1) were manufactured from beef meat and beef fat only, therefore containing no other added ingredients. These patties had a total meat (TM) content ≥99.6% with varying added fat percentages (treatments) to represent the fat content claim (extra lean, lean, regular) according to regulation [10] (Table S3).

Formulation 2: Patty 2 (P2)-Species adulteration
The patties for the burger/patty/hamburger patty category (Figure 2) were manufactured from beef meat and beef fat (10% added), with a TM content ≥70%. According to the regulation, these patties may consist of species mixtures; however, a minimum of 75% of the mixture must consist of the meat of the predominant species mentioned on the packaging. Thus, a maximum of 25% thereof may consist of meat from any other animal, bird or game species. These patties may also contain other added ingredients as stated in the regulation (Table S3). For this formulation, the beef fat, spices and water were kept constant across the treatments. These treatments consisted of a control (75% beef) and nine substitutions, with three treatments per species (pork, lamb, ostrich). Treatments 1 and 2 for each species were the 'authentic' substitutions, with treatment 3 pertaining to the adulterated patties.

Formulation 3: Patty 3 (P3)-Textured vegetable protein adulteration
The patties for the value burger/value patty category (Figure 3) were manufactured from beef meat and beef fat (10% added), with a TM content ≥55% and a minimum total meat equivalent (TME) of 60%. According to the regulation, these patties may contain other added ingredients (Table S3). For this formulation, the beef fat and spices were kept constant across the treatments, with the beef meat, water, burger rusk and TVP varying for each treatment. The treatments with a TM% and a TME% above 55% and 60%, respectively, were the authentic patties, with the treatments below these limits pertaining to the adulterated patties.

Formulation 4: Patty 4 (P4)-Mechanically recovered meat adulteration
The patties for the econo burger/budget burger category (Figure 4) were manufactured from beef meat, beef fat (10% added) and MRM (added at different percentages), with a TM content ≥35% and a minimum TME of 55%. These patties may also contain other added ingredients as stated in the regulation (Table S3). For this formulation, the beef fat, spices and TVP were kept constant across the treatments, with the beef meat, MRM, water, soya fibre and burger rusk varying for each treatment. The treatments with a TM% and a TME% above 35% and 55%, respectively, were the authentic patties, with the treatments below these limits resulting in the adulterated patties.

Moisture, Fat and Protein Analysis
Patties from each formulation and treatment were tested in duplicate for moisture, fat and protein. The moisture content was determined by drying the homogenized patties at 100 °C for 48 h, according to the Official AOAC Method 934.01 [24]. The crude fat content was determined using the chloroform/methanol (1:2 and 2:1) extraction method as described by Lee et al. [25]. Thereafter, the defatted samples from the chloroform/methanol extraction were dried, ground to a fine powder and used for the protein content determination. The crude protein was determined using the Dumas combustion method.

Near-Infrared Hyperspectral Imaging System
Hyperspectral images were acquired with a HySpex SWIR-384 (Norsk Elektro Optikk, Norway) imaging system, in reflectance mode, using the Breeze® (Prediktera) acquisition software. The system's camera comprised a mercury-cadmium-telluride (HgCdTe) detector with a maximum frame rate of 400 frames per second (fps). The samples were illuminated with two 150 W halogen lamps, mounted on a laboratory rack 30 cm above the translation stage at a 53° angle. Individual images were acquired using a 30 cm focal length lens at a working distance of 0.208 m and a field of view of 95 mm, within the spectral range of 952-2517 nm at increments of 5.45 nm between each of the 288 wavelength channels. These images consisted of 384 pixels in width (x) and varied in length (y). Grey and internal black reference standards were recorded every 30 min during the imaging session and used for image correction and calibration. The grey reference standard (Zenith Polymer® Reflectance Standards) comprised a 50% diffuse reflectance polytetrafluoroethylene (PTFE) surface, and the black reference was recorded with the shutter closed. Using the 50% (grey) instead of a 99% (white) reflectance target results in no significant differences in the calculated reflectance values of the sample; the 50% target, however, enables the use of longer integration times, which would normally saturate the 99% target [27]. Because of the longer integration times (3,200 µs), the spectra acquired in this study have an increased signal-to-noise ratio.

Image Acquisition and Correction
Prior to imaging, the patties were removed from the 4 °C (Formulations 1, 2 and 3) and −20 °C (Formulation 4) storage and placed at ambient temperature (ca. 23 °C) for 30 min. Thereafter, the surface of the patties was blotted dry with absorbent tissue paper to remove excess moisture before collecting the images. Each patty was placed on a black silicone tray and its entire surface was imaged. This ensured that most of the variation within one sample was recorded. Unique images were obtained for each patty formulation (category), resulting in 200 images per patty category. Throughout this study, the patties were prepared and imaged under the same controlled conditions. The raw spectral images (R0) were corrected using a radiometric calibration. The grey and black reference images were used to calculate a relative reflectance image (R) using the following equation:

R = 0.5 × (R0 − D)/(W − D)    (1)

where R is the corrected reflectance value, R0 is the raw irradiance value, D (0% reflectance) the dark reference image acquired with the light source off and the camera lens covered with its opaque cap, and W (50% reflectance) the grey reference image obtained from the grey Teflon calibration tile, subjected to the same lighting conditions as the samples. The corrected reflectance spectra were then converted to pseudo-absorbance by taking the logarithm of the reciprocal reflectance (log 1/R). Radiometric calibration from irradiance to radiance to pseudo-absorbance was carried out using the HySpex Ground software v 4.1 (HySpex, Norsk Elektro Optikk, Norway).
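The reference correction and pseudo-absorbance conversion described above can be sketched in NumPy; the function name, array shapes and synthetic values below are illustrative assumptions, not the software's actual implementation.

```python
import numpy as np

def calibrate(raw, dark, grey, grey_reflectance=0.5):
    """Convert a raw hypercube to relative reflectance using dark (0%)
    and grey (50%) reference images, then to pseudo-absorbance.
    raw, dark, grey: arrays of shape (y, x, bands)."""
    # Relative reflectance, scaled by the grey target's nominal reflectance
    R = grey_reflectance * (raw - dark) / (grey - dark)
    R = np.clip(R, 1e-6, None)   # guard against log of non-positive values
    A = np.log10(1.0 / R)        # pseudo-absorbance, log(1/R)
    return R, A

# Synthetic example: a 4 x 4 image with 3 wavelength bands
rng = np.random.default_rng(0)
dark = np.full((4, 4, 3), 100.0)
grey = np.full((4, 4, 3), 900.0)
raw = dark + rng.uniform(100, 700, size=(4, 4, 3))
R, A = calibrate(raw, dark, grey)
```

Because the grey target reflects 50%, the 0.5 factor rescales the ratio so that a pixel as bright as the reference maps to R = 0.5 rather than 1.0.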
Image Segmentation (Cleaning) and Extraction of Spectral Data (Mean Spectrum Calculation)
The process of extracting important objects from an input image is referred to as image segmentation. This is extremely useful for identifying regions-of-interest (ROIs) in tested objects in the form of masks and extracting their spectral features.
The goal of segmenting the images was to separate the patties from the background and unwanted pixels. A threshold value was determined based on the difference in pixel intensity between the patty samples and the image background. The hyperspectral image was reduced to a binary mask by thresholding the image at 952 nm (at this wavelength, the images provided good contrast between the patty sample and the background) with a value of 1.1. Isolated continuous regions of pixels with similar intensity levels were identified as the objects. In this step, each pixel value was replaced with either a 0 or a 1, where 0 indicated the background (non-object pixels) and 1 the ROI (potential object pixels, i.e., patty), resulting in a binary image or image mask (Figure 5). The background of the original hyperspectral images was removed by multiplying the mask across the hypercube along the λ-dimension. Once only the foreground (ROI) of the images remained, it was used to extract the spectral data from the images. The mean absorbance spectrum for each patty was calculated by averaging the spectra of all pixels within the ROI at each wavelength, in order to perform object-wise analyses.
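The thresholding, masking and mean-spectrum steps can be sketched as follows; the function name and the synthetic cube are illustrative assumptions, and a real pipeline would also discard small isolated regions.

```python
import numpy as np

def mean_spectrum(cube, band_index=0, threshold=1.1):
    """Threshold one band of a pseudo-absorbance hypercube to build a
    binary mask (1 = patty, 0 = background), apply it across all
    wavelengths, and return the object-wise mean spectrum.
    cube: array of shape (y, x, bands)."""
    mask = (cube[:, :, band_index] > threshold).astype(np.uint8)  # binary image
    # Broadcast the mask along the wavelength (lambda) dimension
    foreground = cube * mask[:, :, None]
    n_pixels = mask.sum()
    # Average only the ROI pixels at each wavelength
    return foreground.sum(axis=(0, 1)) / n_pixels, mask

# Synthetic check: background absorbance 0.5, patty region 1.5
cube = np.full((10, 10, 5), 0.5)
cube[2:8, 2:8, :] = 1.5
spec, mask = mean_spectrum(cube)
```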
A spectral matrix was created by combining the extracted mean spectrum of each sample (patty). In this study, a spectral matrix was created for the identification of the patty categories (800 patty samples as rows and 288 wavebands as columns). A class was assigned to each sample in the matrix and labelled according to the patty category, i.e., Patty 1, Patty 2, Patty 3 and Patty 4. The mean spectra of each class set (patty category) were computed between 952 and 2517 nm and plotted to investigate, determine and compare the chemical properties.
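Assembling the 800 × 288 spectral matrix with its class labels can be sketched as below; the random spectra stand in for the extracted mean spectra, which are not reproduced here.

```python
import numpy as np

# Stand-in per-category mean spectra (200 patties x 288 wavebands each)
categories = ["Patty 1", "Patty 2", "Patty 3", "Patty 4"]
spectra_by_cat = {c: np.random.default_rng(i).normal(size=(200, 288))
                  for i, c in enumerate(categories)}

# Stack into one 800 x 288 spectral matrix with a parallel label vector
X = np.vstack([spectra_by_cat[c] for c in categories])
y = np.array([c for c in categories for _ in range(200)])

# Class mean spectra (one 288-point curve per category) for plotting
class_means = {c: X[y == c].mean(axis=0) for c in categories}
```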

Pre-Processing
The spectral matrix was subjected to the following pre-processing techniques: (1) mean-centring, (2) standard normal variate (SNV) and (3) SNV + detrending (DT) [28]. These were applied to the spectral data to remove scattering effects by centring and scaling each individual spectrum (SNV), as well as to reduce baseline shift and curvature (DT). These pre-processing techniques were evaluated in combination with different chemometric and machine learning algorithms to determine the combination that yielded the best results.
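SNV and detrending are standard chemometric transforms and can be sketched as below; the polynomial degree for detrending is an assumption (degree 2 is common), as the paper does not state it.

```python
import numpy as np

def snv(X):
    """Standard normal variate: centre and scale each spectrum (row)."""
    mu = X.mean(axis=1, keepdims=True)
    sd = X.std(axis=1, keepdims=True)
    return (X - mu) / sd

def detrend(X, degree=2):
    """Remove a fitted polynomial baseline from each spectrum (row)."""
    idx = np.arange(X.shape[1])
    out = np.empty_like(X, dtype=float)
    for i, row in enumerate(X):
        coeffs = np.polyfit(idx, row, degree)
        out[i] = row - np.polyval(coeffs, idx)
    return out

# SNV + DT applies detrending to the SNV-corrected spectra:
# X_snv_dt = detrend(snv(X))
```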

Principal Component Analysis
Principal component analysis (PCA) [29] was performed on the mean-centred absorbance spectra. The PCA models were calculated with a maximum of four principal components (PCs) to ensure consistency in the analysis. Subsequently, the PC score plots and influence plots were used to detect and identify outliers. These outlier samples were removed from the dataset, and the PCA models were recalculated to further explore the data. In addition to outlier detection, the score plots and loading line plots were used to visualise clusters of classes and to identify important wavelengths, respectively.
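A minimal scikit-learn sketch of this PCA step follows; the random stand-in matrix and the simple 3-standard-deviation outlier flag are assumptions, as the paper uses influence plots rather than a stated numeric rule.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 288))      # stand-in for the spectral matrix
Xc = X - X.mean(axis=0)              # mean-centre the spectra

pca = PCA(n_components=4)            # a maximum of four PCs
scores = pca.fit_transform(Xc)       # PC scores for the score plots
loadings = pca.components_           # loading line plots: (4, 288)

# Crude outlier flag: samples beyond 3 SD on any of the first four PCs
flags = (np.abs(scores) > 3 * scores.std(axis=0)).any(axis=1)
X_clean = Xc[~flags]                 # recalculate PCA on the cleaned set
pca_clean = PCA(n_components=4).fit(X_clean)
```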

Model Development, Calibration and Validation
Prior to model development, the data were randomly split into a calibration (training) and a validation (test) set, with approximately 70% of the original data set used for training and 30% for testing. The calibration models were calculated with leave-one-out cross-validation (CV), where each sample was left out of the calibration set once and subsequently predicted. Afterwards, the test set samples were predicted to evaluate the performance of the models. The cross-validation and performance measures were used to evaluate the models and to identify which pre-treatment/algorithm combination was most suitable for the present dataset.
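The split-and-validate workflow can be sketched with scikit-learn as follows; the synthetic data and the choice of LDA as the example classifier are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import (LeaveOneOut, cross_val_score,
                                     train_test_split)

# Synthetic four-class data standing in for the pre-processed spectra
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(loc=i, size=(30, 10)) for i in range(4)])
y = np.repeat(np.arange(4), 30)

# ~70/30 random split into calibration (training) and validation (test) sets
X_cal, X_val, y_cal, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Leave-one-out cross-validation on the calibration set
cv_scores = cross_val_score(LinearDiscriminantAnalysis(),
                            X_cal, y_cal, cv=LeaveOneOut())

# Fit on the full calibration set, then predict the held-out test set
model = LinearDiscriminantAnalysis().fit(X_cal, y_cal)
val_accuracy = model.score(X_val, y_val)
```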
A grid search (GridSearchCV) was used to find the best hyperparameters for each model. GridSearchCV performs an exhaustive search across the specified parameter values of each classification algorithm. The parameters for a specific algorithm varied according to the pre-processing technique used and the categories/classes examined.
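A minimal GridSearchCV sketch for one of the SVM-C models follows; the parameter grid and synthetic two-class data are assumptions, since the grids varied per pre-processing technique and class set.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic two-class data standing in for one classification task
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(loc=i, size=(25, 8)) for i in range(2)])
y = np.repeat([0, 1], 25)

# Exhaustive search over an illustrative SVM-C parameter grid
param_grid = {"kernel": ["rbf", "linear", "poly"], "C": [0.1, 1, 10]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
best_params = search.best_params_   # best kernel/C combination found
```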

Performance Measures
The overall performance of the individual models with the respective pre-processing techniques was evaluated using the following calculations. The classification accuracy (Equation (2)) illustrates the efficacy of the overall model. The false positive error (Equation (3)) describes the misclassification of a negative response (incorrect class) as a positive response (correct class), while the false negative error (Equation (4)) describes the misclassification of a positive response (correct class) as a negative response (incorrect class). The sensitivity (Equation (5)), specificity (Equation (6)) and precision (Equation (7)) were calculated to assess the performance of the models for a single class. The sensitivity (true positive rate) and specificity (true negative rate) describe the probability that a correct class and an incorrect class, respectively, would be correctly classified, whereas the misclassification rate (classification error) (Equation (8)) describes how often the prediction was incorrect. The precision describes the predictive power of the model by calculating the positive predictive value for each class. With TP, TN, FP and FN denoting the true positive, true negative, false positive and false negative counts:

Classification accuracy (%) = (TP + TN)/(TP + TN + FP + FN) × 100 (2)
False positive error (%) = FP/(FP + TN) × 100 (3)
False negative error (%) = FN/(FN + TP) × 100 (4)
Sensitivity (%) = TP/(TP + FN) × 100 (5)
Specificity (%) = TN/(TN + FP) × 100 (6)
Precision (%) = TP/(TP + FP) × 100 (7)
Misclassification rate (%) = (FP + FN)/(TP + TN + FP + FN) × 100 (8)

A classification accuracy of 75% is considered high in NIR-HSI applications and acceptable for authentication and adulterant identification. However, South African regulations [10] (Table S3) do not allow any adulterated or mislabelled raw processed meat products (e.g., raw burger, raw patty, raw hamburger patty) to enter the supply chain and state that: "Anyone who violates or fails to comply with the provisions of these regulations commits an offense and is subject to a fine or imprisonment if convicted" [10]. Therefore, classification accuracies of 100% would be required. However, as this is a preliminary study and the first of its kind in South Africa, models with accuracies ≥90% were deemed acceptable for classification of the patty categories.
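These per-class measures can be computed directly from confusion-matrix counts, as sketched below; the function name and the example counts are illustrative assumptions.

```python
def class_metrics(tp, tn, fp, fn):
    """Per-class performance measures (in %) from confusion-matrix
    counts: true/false positives and negatives."""
    total = tp + tn + fp + fn
    return {
        "accuracy":           100 * (tp + tn) / total,
        "false_positive_err": 100 * fp / (fp + tn),
        "false_negative_err": 100 * fn / (fn + tp),
        "sensitivity":        100 * tp / (tp + fn),   # true positive rate
        "specificity":        100 * tn / (tn + fp),   # true negative rate
        "precision":          100 * tp / (tp + fp),   # positive predictive value
        "misclassification":  100 * (fp + fn) / total,
    }

# Hypothetical counts for one class out of 200 predictions
m = class_metrics(tp=48, tn=145, fp=5, fn=2)
# m["accuracy"] == 96.5, m["sensitivity"] == 96.0
```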

Patty Category Determination
This data set consisted of 800 patty samples from four patty categories [Patty 1 (ground burger/patty), Patty 2 (burger patty), Patty 3 (value burger/patty), Patty 4 (econo/budget burger)], with multiple treatments per category (200 patties per category). The aim of this study was to distinguish between the four patty categories, irrespective of the treatments or the authenticity status.

Spectral Analysis
The mean spectrum of each patty category is shown in Figure 6. The mean spectra of the four patty categories followed a similar trend and exhibited five broad absorption bands at 1198, 1460, 1738, 1934 and 2326 nm. The bands at 1198 nm (C-H stretch 2nd overtone) and 1738 nm (C-H stretch 1st overtone) indicate the presence of fat, as specified by Osborne et al. [37] and Murray [38], respectively. The bands observed at 1460 nm (O-H stretch 2nd overtone) [39] and 1934 nm (O-H stretch + O-H deformation) [40] are associated with water and could be related to the moisture content of the samples. Lastly, the band at 2326 nm represents the C-H stretch + C-H deformation related to the CH2 group [37] and can be associated with protein, as specified by Downey and Beauchêne [41].
The mean spectra of the four patty categories clearly exhibited differences in absorbance values at the bands associated with moisture, fat and protein. These differences correlate with the variances observed in the proximate chemical composition analysis results (Table S4) and therefore support the spectral interpretation of the patties.

Figure 7 shows the PCA score plots for the patty categories analysed with the first three PCs. The score plots show clusters with minimal/gradual separation between the different classes. The lack of separation indicates similarity in their spectral signatures; however, there are numerous aspects (e.g., physical and chemical) that could differ, leading to slight spectral differences [42]. These differences are most likely associated with the characteristics and fluctuations of the macronutrient composition within the patties. Moisture, fat and protein could therefore explain the minimal/gradual separation between the different patty categories.
PC1 explained 83.03% of the total variance (Figure 7), while PC2 and PC3 explained 16.55% and 0.33%, respectively. The loadings (Figure 8) exhibited the variables important to a given PC. The highest interpretable loadings on PC1 and PC2 were found around 2294 nm (N-H stretch + C=O stretch), related to the amino acids found in protein [37]. PC3 had interpretable loadings at 1138 nm (C-H stretch 2nd overtone), 1318 nm, 1460-1580 nm (N-H stretch 1st overtone), 1660-1830 nm (C-H stretch 1st overtone), 1945 nm (O-H stretch + O-H deformation) and 2294 nm. These bands are associated with fat from beef (1138 nm) [43], protein (1318 nm; 1460-1580 nm) [37,44], fat (1660-1830 nm) [37,39], moisture (1945 nm) [37] and protein (2294 nm) [37]. In other words, Patty 4 (P4) separates from P1, P2 and P3 in the direction of PC1 and PC2; based on the loadings, this would be due to the difference in protein. P1 and P4 separate in the direction of PC3. By interpreting the loadings and comparing them to the chemical composition analysis results, it was concluded that this separation was due to P1 being higher in fat and protein and P4 higher in moisture. These results agreed with the proximate chemical composition analysis (Table S4); it is therefore evident from the average moisture, fat and protein content of the patties that these constituents were responsible for the minimal/gradual separation between the classes.

Classification Models
Trees, RF, SVM-C (rbf, linear, poly, sigmoid) and PLS-DA models were calculated to distinguish between the four patty categories (P1-P4). This was done by evaluating the performance of two pre-processing techniques (SNV and SNV+DT) (Table 5).

Table 5. An overview of the accuracies for the developed models, pre-processed with SNV and SNV + detrend, to distinguish between the patty categories. The best predictions are indicated in bold.

The SNV and SNV+DT pre-processing techniques gave similar results for the specific chemometric and machine learning algorithms investigated (Table 5). After investigation, it was concluded that the LDA and SVM-C (rbf, linear, poly) models yielded the best results and could effectively distinguish the four patty categories from one another with accuracies ≥97%. The LDA and SVM-C models all achieved classification accuracies of 100% (calibration), 98-99% (cross-validation) and 97-98% (validation), indicating that the models were not over-fitted [45] and were effective when predicting the individual patties. These models were then further investigated to determine their performance measures (Tables 6-8).
The LDA and SVM-C models all achieved classi 100% (calibration), 98-99% (cross-validation) and 97-98% (valida that the models were not over-fitted [45] and effective when pred patties. These models were then further investigated to determin measures (Tables 6-8).     The performance measures exhibit how well the models classified each individual class. The classification of each class was high with the models being capable of distinguishing between the classes with accuracies ranging from 98.3 to 99.2% (Table 6). Due to the models achieving very similar performance measures, the mean performance values for the LDA (Table 7) and each SVM-C (rbf, linear, poly) model (Table 8) was calculated to simplify the interpretation. The mean performance measures for the LDA model (Table 7), shows that it was capable of distinguishing between each patty category with classification accuracies above 98%. The increased sensitivity (100%) and specificity (99.2%) for P4, indicates that the model has a high probability of correctly classifying Patty 4, thus resulting in a classification accuracy of 99.4%. The slightly lower sensitivity for P1 (97.7%), P2 (95.9%) and P3 (95.4%), reveals that the model was to some extent less suited for predicting these three patty classes. This phenomenon can be explained by referring to the calculation of the LDA model. LDA models are constructed using the PC scores, where the algorithm calculates an optimal linear projection between the classes, while keeping the variance within a class to a minimum [30]. Objects (patties) are classified by calculating the distance to the centre of each class. The objects are then assigned to a class to which it has the shortest distance. The scores plot (Figure 7) illustrates an overlap between the samples of Patty 1, Patty 2 and Patty 3, with a slightly larger number of Patty 3 samples displaying a close distance to the samples of Patty 1 and Patty 2. 
Therefore, Patty 3 resulted in a lower classification accuracy, as these patties were assigned to the predominant class, Patty 1 and/or Patty 2. This explains why the model was slightly less suited to predicting the Patty 3 category samples.
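As a concrete illustration, the SNV pre-processing and PC-score-based LDA described above can be sketched as follows. This is a minimal sketch on synthetic 288-waveband spectra; the `snv` helper, the class offsets and the choice of 10 principal components are illustrative assumptions, not the study's actual pipeline:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

# Synthetic stand-in for 288-waveband spectra of four patty classes.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(4), 100)
X = rng.normal(size=(400, 288)) + y[:, None] * 0.5

X_snv = snv(X)
scores = PCA(n_components=10).fit_transform(X_snv)  # PC scores feed the LDA

# LDA projects the PC scores to maximise between-class separation,
# then assigns each object to the nearest class centre.
lda = LinearDiscriminantAnalysis().fit(scores, y)
print(f"training accuracy: {lda.score(scores, y):.2f}")
```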

The overall mean classification accuracies for the individual patty classes of the SVM-C models for the rbf (>97.1%), linear (>98.5%) and polynomial (>98.1%) kernel functions suggested that the model calculated with the linear kernel would provide slightly better classification (Table 8). A kernel is a similarity function that quantifies how alike two inputs are; it is used when computing the separating hyperplanes and thus shapes the robustness of the classifier model. SVM is a maximal margin classifier in which the algorithm aims to separate the different categories/classes in a dataset by placing a separating line in the middle of the margin. The empty space between the boundaries, known as the maximum margin or optimal margin hyperplane, indicates the maximum separation between two groups. The data points touching the boundary of the margin are known as the support vectors [46]; they serve as training patterns and convey all pertinent information regarding the classification problem [47]. Ultimately, SVM is a distance-based approach that calculates the optimal distance between data points and the hyperplane [46].
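The margin geometry described above can be made concrete with a small sketch. Two synthetic, well-separated classes stand in for two patty categories; scikit-learn's SVC is assumed here, not the study's implementation:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Two well-separated toy classes standing in for two patty categories.
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Support vectors are the training points lying on the margin boundaries.
print("support vectors per class:", clf.n_support_)
# For a linear kernel the margin width is 2 / ||w||.
w = clf.coef_[0]
print("margin width:", 2 / np.linalg.norm(w))
```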
The mean performance measures for the SVM-C (linear) model given in Table 8 show that the P4 class achieved the highest mean classification accuracy (99.8%), followed by P2 (99.0%), P3 (99.0%) and P1 (98.5%). The model also revealed that the classification of each class was nearly perfect, with the sensitivity (>97.0%), specificity (>98.3%) and precision (>95.5%) confirming this observation. The slightly lower classification accuracy of P1, P2 and P3 was ascribed to the increased false positive and false negative errors due to objects being misclassified. The lack of separation can be attributed to the patties' spectral similarities, which correspond to their similar proximate chemical composition results as reported in Table S4. The sensitivity and specificity of Patty 1 were 99.2% and 98.3%, respectively. This suggests that the model is equally sensitive when predicting a true positive (Patty 1) as correct as it is specific when predicting a true negative (Patty 2, Patty 3 and Patty 4) as correct. The sensitivity of Patty 2 and Patty 3 was 97.5% and 97.0%, respectively, and the specificity 99.4% and 99.1%, revealing that the model is less sensitive and more specific when classifying these patties. The results for the SVM-C models given in Table 8 show that this machine learning algorithm was capable of distinguishing between each patty category.
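For reference, the per-class performance measures quoted here (sensitivity, specificity, precision and accuracy) follow from a one-vs-rest reading of the confusion matrix, as in this sketch; the toy labels are hypothetical:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def class_measures(y_true, y_pred, cls):
    """One-vs-rest sensitivity, specificity, precision and accuracy for one class."""
    cm = confusion_matrix(y_true, y_pred)
    tp = cm[cls, cls]                 # true positives for this class
    fn = cm[cls].sum() - tp           # its samples assigned elsewhere
    fp = cm[:, cls].sum() - tp        # other samples assigned to it
    tn = cm.sum() - tp - fn - fp      # everything else
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / cm.sum(),
    }

# Hypothetical predictions for four patty classes (P1-P4, coded 0-3).
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 3, 3]
print(class_measures(y_true, y_pred, cls=0))
```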
From both the LDA and the SVM-C (rbf, linear, poly) results it was evident that these models were capable of accurately distinguishing Patty 4 from the other three patties, with a misclassification rate below 0.6%. The PCA scores (Figure 7) and loadings (Figure 8) support these results, as a slight separation was observed between Patty 4 and the other three patties. The separation and correct classification of P4 was mainly attributed to its lower protein content, which accounted for the spectral differences. The protein content of Patty 4 (10.7%) differed considerably from that of the other patties [Patty 1 (19.5%), Patty 2 (14.4%), Patty 3 (14.2%)], and a corresponding pattern was observed in the chemical results (Table S4). Therefore, the results illustrate that the above-mentioned models were able to predict the different patty classes with relatively high accuracies, demonstrating their effectiveness.
Although the D.Trees and RF models both exhibited calibration accuracies of 100%, indicative of an effective model, the cross-validation [82-83% (D.Trees); 88-90% (RF)] and validation accuracies [74-76% (D.Trees); 83-93% (RF)] decreased (Table 5). This indicated that the models were over-fitted [45] and therefore not effective. Over-fitting is a known drawback of decision trees, especially when dealing with many features; this dataset consisted of 288 features (wavebands), which explains the tendency of these models to overfit. A decision tree is a flowchart-like tree structure whose basic development involves splitting the predictor space, using recursive binary splitting, into a number of simple regions covering all the possible outcomes [48]. The binary splits are made using the classification error rate, i.e., the fraction of training observations in a given region not belonging to the most common class. An unknown observation in a given region is assigned to the most common class in that region. Although decision trees are simple and useful for interpretation, their predictive accuracy lags behind other supervised learning approaches [48]. Hence, the prediction accuracy can be substantially improved by producing multiple trees and aggregating them using a method like random forests. This was supported by the improved cross-validation and validation results observed in Table 5 for the RF models. Although a random forest is a collection of decision trees, the trees are constructed differently: in a decision tree, each node is selected using all the features to gain the maximum amount of information, whereas random forests consider only a small subset of features when constructing a node, at the cost of some interpretability [48]. This difference explains the improved classification performance of the RF models.
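The single-tree versus forest behaviour described above can be reproduced on synthetic data of the same shape (288 features, four classes). The dataset and hyperparameters below are illustrative assumptions, not the study's settings:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# High-dimensional toy data echoing the 288-waveband, four-class setting.
X, y = make_classification(n_samples=400, n_features=288, n_informative=20,
                           n_classes=4, random_state=0)

# A single tree tends to overfit many features; averaging many trees
# (random forest) typically recovers much of the lost generalisation.
for model in (DecisionTreeClassifier(random_state=0),
              RandomForestClassifier(n_estimators=200, random_state=0)):
    cv = cross_val_score(model, X, y, cv=5)
    print(type(model).__name__, f"CV accuracy: {cv.mean():.2f}")
```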
The KNN, SVM-C (sigmoid) and PLS-DA models exhibited unsatisfactory classification and discrimination results (Table 5). This can be explained by referring to how the individual models are computed. KNN is a distance-based, non-parametric discriminant classification method [49], performed on the PC scores. The distance between an unknown sample (test set) and a set of samples with known class membership (training set) is calculated to classify the unknown. The K closest samples, with K determined by GridSearchCV, and majority voting were used to make a classification. Therefore, the classification of the individual patties was impaired by the close distance and overlap between the four classes, given the minimal separation observed in the scores plot (Figure 7).
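A minimal sketch of this KNN-on-PC-scores procedure, with K tuned by GridSearchCV, might look as follows; the synthetic data and candidate K values are assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=288, n_informative=20,
                           n_classes=4, random_state=0)
scores = PCA(n_components=10).fit_transform(X)  # KNN runs on the PC scores

# GridSearchCV picks K, the number of neighbours used for majority voting.
grid = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [3, 5, 7, 9]}, cv=5)
grid.fit(scores, y)
print("best K:", grid.best_params_["n_neighbors"])
```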
The SVM-C (sigmoid) model's decreased accuracy could be explained by the fact that sigmoid kernels are less adaptable than rbf kernels, resulting in higher bias when computing the separating hyperplanes [46]. Optimal model performance can be achieved when the penalty coefficient C and gamma parameters are optimised using a grid-search procedure. The C value is an important parameter, as it determines the size of the margins of the hyperplane and the extent to which the model under-fits or over-fits the data. A small C results in broad 'soft margins', with a high tolerance for observations violating the constraints; the 'soft margin' allows for the misclassification of some training samples, which can yield a better overall model fit [50]. As C increases, the tolerance for observations violating the constraints decreases and the margins narrow, resulting in 'hard margins' and over-fitted data. Both SVM-C (sigmoid) models had low C values, and the resulting overly soft margins could explain the reduced classification performance.
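The grid-search over C and gamma mentioned above can be sketched like this, using scikit-learn's SVC with a sigmoid kernel on synthetic data; the parameter grid is an illustrative assumption:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=50, n_informative=10,
                           n_classes=4, random_state=0)

# Grid-search the penalty coefficient C and gamma for the sigmoid kernel.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.001]}
grid = GridSearchCV(SVC(kernel="sigmoid"), param_grid, cv=5)
grid.fit(X, y)
print("best parameters:", grid.best_params_)
print(f"best CV accuracy: {grid.best_score_:.2f}")
```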
The four-way PLS-DA model was constructed using binary dummy variables to indicate the presence or absence of a specific class during the modelling process: a value of one was assigned if the spectrum belonged to the correct group, and a value of zero if it did not [42]. While achieving separation of two classes is known to be relatively easy [27], the results (Table 5) in the current study support the notion that the separation of multiple classes is challenging. Although many studies have utilised a single globally optimised model to discriminate between two, three or even four classes [27], the development of such models is not always straightforward. This type of multiclass model approach is only possible once all the classes are fully separable using the chosen set of spectral features. This requirement is frequently not met, particularly when working with heterogeneous samples and closely related classes [27], as was observed in the scores plot (Figure 7). For this reason, the PLS-DA models exhibited unsatisfactory discrimination accuracies.

Conclusions
The South African meat industry currently relies on destructive and time-consuming techniques that require complex laboratory procedures to authenticate processed meat products. NIR hyperspectral imaging has been considered as an automated alternative to improve and modernise this process. Although the initial investment cost for an HSI system is substantial (prices start at USD 28k) [51], the economic importance of meat and processed meat products, as well as the wide-ranging benefits of improved authentication procedures, could offset this price tag. In addition, there will be savings on reagents, increased speed of analysis and overall higher throughput. A series of models, using various chemometric and machine learning algorithms, were calculated in order to classify each object as one of the four patty categories. The SVM-C (linear) models distinguished the four patty categories with classification accuracies ≥98.5%.
This study sets the benchmark, as it is the first time that NIR-HSI has been investigated for the rapid detection, classification and categorisation of beef patties based on the South African regulations and different water:protein:fat ratios. NIR-HSI is a suitable, eco-friendly technique and shows potential to rapidly and accurately distinguish between categories of processed beef burger patties consisting of various ingredients, species and adulterants. Additionally, NIR-HSI has excellent application prospects, since the approach adheres to the fundamental principles of 'green science', reducing waste formation and not requiring chemical reagents or solvents. Furthermore, this study has the potential to provide an alternative to the current AOAC-recommended conventional, manual, destructive and time-consuming methods, thus contributing to the authenticity and fair trade of processed meat products locally and internationally. However, the South African legislation is strict, and the classification of the patty categories must be refined further before the meat sector considers spectral imaging. Therefore, the next step for this research would be to investigate the authentication and adulterant detection/quantification of the patties within each individual patty category.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s23020697/s1, Figure S1: A schematic of the experimental design, formulation and treatments for Patty 1 (ground burger/ground patty); Figure S2: A schematic of the experimental design, formulation and treatments for Patty 2 (burger/patty/hamburger patty/meatball/frikkadel); Figure S3: A schematic of the experimental design, formulation and treatments for Patty 3 (value burger/value patty/value hamburger/value meatball/value frikkadel/Any other similar name); Figure S4: A schematic of the experimental design, formulation and treatments for Patty 4 (economy burger/econo burger/economy patty/econo patty/budget burger/econo hamburger patty/budget hamburger patty/econo meatball/econo frikkadel/Any other similar name); Table S1: Minimum size of a primary sample; Table S2: Recommended methods of analysis; Table S3: Regulations regarding the classification and compositional specifications of raw beef patties; Table S4: An overview of the proximate chemical composition analysis results (means ± SD) for the moisture-, fat-and protein content (%) of the various patties.