Raman Imaging for the Detection of Adulterants in Paprika Powder: A Comparison of Data Analysis Methods

Abstract: Raman imaging requires the effective extraction of chemical information from the corresponding datasets, which can be achieved by a range of analytical methods. However, since each of these methods exhibits both strengths and weaknesses, we herein directly compare univariate, bivariate, and multivariate analyses of Raman imaging data by evaluating their performance in the quantitation of two adulterants in paprika powder. Univariate and bivariate models were developed based on the spectral features of the target adulterants, whereas spectral angle mapper (SAM), adopted as a multivariate analysis method, utilized the complete dataset. The obtained results demonstrate that despite being simple and easily implementable, the univariate method affords false positive pixels in the presence of background noise. This problem can be resolved using the bivariate method, which multiplies two band images in which the same adulterant shows high-intensity peaks exhibiting the least overlap with those of other sample constituents. Finally, images produced by SAM contain abundant false negative pixels of adulterants, particularly for low-concentration samples. Notably, the bivariate method affords results closely matching the theoretical adulterant content, exhibiting the advantages of using non-complex data (only two bands are utilized) and being well suited to online applications of Raman imaging in the agro-food sector.


Introduction
Raman spectroscopy coupled with chemical imaging finds numerous applications in the food and pharmaceutical industries, combining both spectral and spatial information and thus allowing the simultaneous identification and localization of chemical species. Since the chemical properties and distribution of species usually influence the quality of both food and pharmaceutical samples, the above technique can be used to generate chemical maps showing the distributions of certain parameters of interest [1]. This unique feature of Raman imaging has made it a popular method of pharmaceutical quality analysis from the initial step of preparing API-excipient solid dispersions (e.g., by powder blending) through the manufacturing process until the final product fabrication step [2,3]. Although Raman imaging has also been used for food quality analysis, its full-scale potential for food quality and authenticity evaluation as well as the selection of appropriate processing methods for imaging data are yet to be optimized. The applications of Raman imaging for agro-food product evaluation are reviewed in references [1,4].
Up to now, Raman imaging has been performed using confocal imaging systems that collect a large number of spectra at the desired sample positions, which is known as the point-scan method. Recently, macro-scale, line-scan Raman imaging for high-throughput screening, pioneered by Qin et al. [5], has been established and used for a range of applications related to agro-food quality analysis, in particular determining the authenticity of powdered foods. The above method uses a several-centimeters-wide laser line to illuminate the sample and a relatively large charge-coupled device (CCD) detector to generate the corresponding Raman chemical images. Since the laser line can illuminate a large sample area, and the scattered Raman signal can be mapped by the CCD, one sample dimension can be scanned at a time, allowing faster analysis.
As the hyperspectral imaging technique acquires spectral information from each pixel unit, it can yield detailed sample information on the scale of a single pixel [6]. The numeric format of line-scan Raman imaging data corresponds to a 3D hypercube in which intensity values are functions of two spatial dimensions and one spectral dimension. Since macro-scale Raman imaging requires the effective extraction of information from this hypercube, the analysis of collected data is very important for evaluating the spatial distribution of a given constituent. The choice of an appropriate data analysis method allows the visualization of sample biochemical constituents separated into particular image regions, with methods for preprocessed Raman hypercube analysis broadly classified into univariate, bivariate, and multivariate methods.
The simplest and most convenient approach is the univariate method, which considers only one Raman peak and thus presents information on only one characteristic functional group. Since Raman spectra comprise numerous peaks, the intensity of a single band can be utilized to map the component of interest by plotting unique frequencies as functions of spatial position and spectral intensity. However, the key requirement of univariate imaging is the precise spectral characterization of the sample (i.e., adulterant materials) prior to imaging, as the uniquely assignable wavenumbers should be known in advance [7]. In food authenticity analysis, the unique peaks of potential adulterants are either known or determined experimentally and thus allow single-band imaging of the target chemical, as exemplified by the successful application of Raman mapping in chocolate analysis [8], detection of melamine in milk powder [9,10], and benzoyl peroxide in wheat flour [11].
When the target chemical (adulterant) does not show a relatively high-intensity band or this band partially overlaps with the Raman peaks of other components, imaging can be performed using bivariate analysis. This method utilizes two data points such as two Raman peaks of the same component, using the summation or multiplication of the two selected bands to improve the signal-to-noise ratio and thus generate the required Raman image. Since most studies utilizing Raman imaging for the detection of food adulteration focus on a single adulterant showing intense Raman peaks not overlapping with those of the background material, bivariate analysis has not yet been used for agro-food quality analysis.
In contrast to the above methods, multivariate analysis employs full spectra and can thus be adopted when selective information is not available [12]. Although the application of this method for both qualitative and quantitative imaging has been reviewed [1,13], most of the provided examples correspond to pharmaceutical and biomedical domains. Only a few investigations apply multivariate analysis for Raman imaging of agro-food products, probably because the unique Raman peaks of adulterants allow the application of the univariate method. However, multivariate analysis methods are highly recommended when the nature of adulterants is unknown or when the sample is suspected to contain multiple adulterants, particularly if one wants to avoid target-based conventional methods that can only deal with one kind of adulterant at a time. The above methods are particularly useful for extracting pure component spectra by means of classical least squares analysis, self-modeling mixture analysis (SMA), independent component analysis, etc. Another option is the development of a spectral library of potential adulterants for a particular food product and the utilization of spectral similarity analysis to compare individual adulterant spectra with those of collected pixels. Qin et al. [14] and Dhakal et al. [15] used an SMA algorithm to extract the Raman signature of adulterants added to milk powder and create Raman chemical images of these samples, revealing a strong correlation between the predicted and SMA imaging-based adulterant contents.
Very few studies compare univariate and multivariate analysis methods, and those that do mainly concern pharmaceutical [7] and biomedical [16] applications. To the best of our knowledge, no comparison of analysis methods for the quantitative characterization of food material Raman imaging data has yet been reported. Thus, since Raman imaging-based food quality and authenticity analysis is gaining popularity, and food samples differ from pharmaceutical and biological materials (e.g., by exhibiting higher fluorescence backgrounds), a comprehensive comparison of data analysis methods is urgently required. Therefore, we herein aimed to quantitatively compare the performances of univariate, bivariate, and multivariate analyses of imaging data, using these methods to generate Raman chemical images of two adulterants (Sudan-I and Congo Red dyes), thus facilitating the visualization of their spatial distribution and quantifying their concentration. Since only a single dataset was tested in this study, the findings are valid for the reported dataset and not necessarily for all Raman imaging data.

Sample Preparation
Paprika powder was procured from a local market in Korea, and two adulterants (Sudan-I and Congo Red dyes) with purities of >95% were purchased from Sigma-Aldrich (St. Louis, MO, USA). The above dyes were selected as adulterants because they are frequently used as food colorants to enhance the appearance of paprika or chili powder [17]. Appropriate amounts of adulterants were added to the paprika powder to obtain four adulterant-spiked samples (0.1, 0.25, 0.5, and 0.75 wt %), e.g., the 0.1 wt % sample contained 0.1 wt % Sudan-I, 0.1 wt % Congo Red, and 99.8 wt % paprika powder. The selected adulterant concentration range was based on a previous report [18]. The spiked samples were loaded into a centrifuge tube and subjected to 5-min vortex mixing. For Raman imaging, each sample was packed into a custom-designed, aluminum-plated sample holder with interior dimensions of 40 mm × 40 mm × 3 mm, and the powder surface was leveled with the upper edge of the sample holder using a spatula.

Instrumentation and Data Collection
A schematic representation of the utilized line-scan Raman imaging system is shown in Figure 1, with excitation performed using a custom-designed diode near-infrared laser system (OptiGrate Corp., Oviedo, FL, USA) combining the laser beam from 19 emitters to produce a relatively high-intensity laser line. The laser beam emanating from the laser box was passed through a cylindrical lens (f = 200) and an engineered diffuser (ED1-L4100; ThorLabs, Hans-Boeckler, Dachau, Germany) mounted next to the above lens to obtain homogeneous laser intensity at each point of the laser line. The generated laser line was then projected onto a 785-nm beam splitter placed at 45° to reflect the laser beam onto the sample surface and act as a filter to mitigate Rayleigh scattering generated during Raman data collection. The dimension of the generated laser line on the sample surface was approximately 1.5 mm × 16 mm, and a laser power of ~450 mW was measured by a digital power meter (PM100D, ThorLabs, Germany).
The generated Raman signals were passed through the beam splitter to the sensing unit, which comprised an imaging spectrograph and a CCD camera. An objective lens with a focal length of 22 mm was mounted onto the spectrograph for focus and aperture adjustment. The generated Raman signals were further filtered to remove Rayleigh-scattered photons using two 785-nm long-pass filters and subsequently entered the imaging spectrograph via a slit. A 16-bit CCD camera (iKon-M 934, Andor Technology, South Windsor, CT, USA) was placed in the focal plane of the spectrometer to collect the dispersed signals and create images. During image collection, the CCD was cooled to −65 °C to reduce the dark current effect. The camera was connected to a computer via USB cables for control and data transfer. The whole system, with the exception of the computer, was installed in a black box to exclude the influence of surrounding light. The laser-line uniformity on the sample surface was measured by scanning a Teflon sheet and was about 14 cm.
The sample holder packed with a given adulterated paprika powder sample was placed on a conveyor belt for line-by-line scanning. In addition to adulterant-containing samples, samples of pure paprika powder and pure adulterants were also prepared as references. To improve the signal-to-noise ratio, the exposure time was set to 1 s, and a total of 270 steps with a step size of 0.15 mm/scan were selected to cover the spatial shape of the sample. Raman data were collected in the wavelength range of 740-1000 nm (corresponding to Raman shifts of −763 to 2837 cm⁻¹) and at a CCD spatial binning of two. The generated Raman images were saved in ENVI format as a 3D hypercube. The dark reference was obtained with the laser turned off and the camera lens covered by a cap, with the obtained dark current subtracted from each collected hypercube to obtain corrected data for further analysis.

Preprocessing
The large size of the Raman hypercubes generated for each sample leads to computational complexity and hinders efficient data processing. Therefore, the data size was reduced in the spatial direction (by selecting the spatial region of interest) and in the spectral direction (by examining the spectral features of adulterants and keeping only the informative spectral range). Fluorescence signals are usually generated during laser-sample interaction in the Raman measurement, particularly for food materials, and may overwhelm the informative Raman signal. Therefore, the reformatted Raman hypercube was first unfolded to a 2D dataset, and the fluorescence background was removed from the spectrum of each pixel using an adaptive, iteratively reweighted, penalized least squares (airPLS) method [19]. This method fits a baseline to the spectrum and subtracts the fitted baseline from the original spectrum to acquire a fluorescence-free spectrum. Moreover, a median filter with a 3 × 3 moving window was used to remove high-frequency noise. The corrected 2D dataset was then refolded to a 3D one to generate univariate and bivariate images.
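The unfolding, refolding, and median-filtering steps described above can be illustrated with a minimal Python/NumPy sketch (the analysis in this work was programmed in MATLAB, so this is not the code actually used). The function names `unfold`, `refold`, and `median_filter_3x3` are hypothetical, the toy hypercube is synthetic, and it is assumed here that the 3 × 3 median window acts on a single band image with replicated edges. The airPLS baseline step is only marked by a comment and is not implemented.

```python
import numpy as np

def unfold(cube):
    """Reshape a (rows, cols, bands) hypercube into a (pixels, bands) matrix."""
    rows, cols, bands = cube.shape
    return cube.reshape(rows * cols, bands)

def refold(matrix, rows, cols):
    """Inverse of unfold: restore the (rows, cols, bands) hypercube."""
    return matrix.reshape(rows, cols, -1)

def median_filter_3x3(image):
    """3 x 3 moving-window median of a 2D array; edges are padded by replication."""
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    # Nine shifted views of the padded image, one per position in the window.
    windows = [padded[r:r + rows, c:c + cols] for r in range(3) for c in range(3)]
    return np.median(np.stack(windows), axis=0)

# Synthetic hypercube: 4 x 5 pixels, 6 bands, flat signal with one isolated spike.
cube = np.ones((4, 5, 6))
cube[2, 3, 1] = 50.0

spectra = unfold(cube)            # shape (20, 6): one spectrum per row
# A per-spectrum baseline correction (e.g., airPLS) would act on `spectra` here.
restored = refold(spectra, 4, 5)  # back to shape (4, 5, 6)

band_image = restored[:, :, 1]
filtered = median_filter_3x3(band_image)  # the isolated spike is replaced by the local median
```

Working on the unfolded matrix lets any per-spectrum correction be written once and applied row by row, while the refolded cube restores the spatial layout needed for image-based steps.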

Univariate and Bivariate Analyses
The univariate method was employed by selecting a single-band image for each adulterant based on one of its non-overlapping and intense Raman peaks. In another approach, the bivariate method was employed, featuring the multiplication of two different band intensities of the same adulterant. For this purpose, Raman images based on two different bands at which Sudan-I exhibits highly intense peaks, showing the lowest interference from other adulterants or food constituents, were multiplied together to generate a band multiplication image. The same methodology was applied to generate a band multiplication image for Congo Red. The selected band images and the bivariate images for each adulterant were subjected to image thresholding, i.e., a threshold value was applied to distinguish between adulterant pixels and food background. The threshold value for the univariate method was selected by comparing the adulterant and paprika powder peak intensities (peak heights) at the selected waveband for each adulterant. Similarly, the intensities of the multiplied bands for the adulterants and paprika powder were compared to select a final threshold value for images generated using the bivariate method.
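The difference between single-band and band-multiplication thresholding can be illustrated with a short Python/NumPy sketch on synthetic data. The helper names `band_image` and `threshold`, the toy intensities, and both threshold values are hypothetical and chosen only for illustration; the band positions 1227 and 1493 cm⁻¹ are those reported for Sudan-I in the Results.

```python
import numpy as np

def band_image(cube, shifts, target):
    """Return the single-band image at the Raman shift closest to `target` (cm^-1)."""
    index = int(np.argmin(np.abs(shifts - target)))
    return cube[:, :, index]

def threshold(image, value):
    """Binary mask: True where the intensity exceeds the threshold, else False."""
    return image > value

# Synthetic data: 3 x 3 pixels and two bands (1227 and 1493 cm^-1).
shifts = np.array([1227.0, 1493.0])
cube = np.full((3, 3, 2), 1.0)   # weak, uniform background
cube[0, 0, :] = [10.0, 8.0]      # adulterant particle: strong in both bands
cube[2, 2, 0] = 9.0              # noise spike: strong in one band only

b1 = band_image(cube, shifts, 1227.0)
b2 = band_image(cube, shifts, 1493.0)

univariate_mask = threshold(b1, 5.0)        # hypothetical single-band threshold
bivariate_mask = threshold(b1 * b2, 25.0)   # hypothetical threshold on the band product
```

In this toy case the single-band mask flags both the particle and the noise spike, whereas the product image retains only the pixel that is intense in both bands.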

Multivariate Analysis
The above preprocessed data were also processed using a multivariate procedure. Herein, the pixels of adulterants and food powder were identified using spectral angle mapper (SAM) analysis, which relies on determining the similarity between reference (endmember) and target spectra [20]. Therefore, the Raman spectra of pure adulterants were used as endmembers, and their similarities to the spectra of pixels in the Raman images of adulterated paprika samples were calculated as follows:

$$\alpha = \cos^{-1}\left(\frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^{2}}\,\sqrt{\sum_{i=1}^{n} b_i^{2}}}\right)$$

where α is the spectral angle, a is the preprocessed 2D data (spectrum of each pixel), b is the reference spectrum of the adulterant (Sudan-I or Congo Red), and n is the number of spectral bands used for calculation. The results of the SAM algorithm are displayed as rule images, one for each reference class defined. They give information about the relevance of each pixel to a reference class: the darker the pixel, the more relevant it is to a particular class [21]. The output rule images had the same spatial dimensions as the single-band Raman images and were used for the quantitative visualization of adulterant particles after thresholding. The global threshold value for the SAM rule image calculated for Sudan-I was selected by determining the angle between the Raman spectrum of each pixel in pure paprika powder and that of Sudan-I, as well as the angle between each pixel in the image of pure Sudan-I and the mean spectrum of this dye. The threshold value was determined as the median value of the maximum angle of Sudan-I and the minimum angle of paprika powder. All image collection, correction, and analysis processes were programmed in MATLAB (MathWorks, Natick, MA, USA).
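As an illustration of the spectral angle calculation and of the threshold rule described above, a minimal Python/NumPy sketch is given below. It uses the standard SAM definition (the arccosine of the normalized dot product of two spectra); the function names `spectral_angle` and `midpoint_threshold` and the four-band toy spectra are hypothetical and do not represent measured data.

```python
import numpy as np

def spectral_angle(a, b):
    """Spectral angle (radians) between pixel spectra `a` (pixels x bands) and reference `b`."""
    a = np.atleast_2d(a).astype(float)
    b = np.asarray(b, dtype=float)
    cosine = (a @ b) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b))
    # Clip guards against rounding slightly outside [-1, 1] before arccos.
    return np.arccos(np.clip(cosine, -1.0, 1.0))

def midpoint_threshold(angles_adulterant, angles_background):
    """Median of the largest adulterant angle and the smallest background angle."""
    return float(np.median([np.max(angles_adulterant), np.min(angles_background)]))

# Synthetic four-band spectra standing in for the endmember and the pure-sample pixels.
reference = np.array([0.0, 1.0, 0.0, 1.0])
dye_pixels = np.array([[0.0, 2.0, 0.0, 2.0], [0.1, 1.0, 0.0, 0.9]])
paprika_pixels = np.array([[1.0, 0.0, 1.0, 0.0], [1.0, 0.2, 0.8, 0.1]])

dye_angles = spectral_angle(dye_pixels, reference)
paprika_angles = spectral_angle(paprika_pixels, reference)
cutoff = midpoint_threshold(dye_angles, paprika_angles)

# Pixels whose angle to the endmember falls below the cutoff are assigned to the dye.
all_pixels = np.vstack([dye_pixels, paprika_pixels])
is_dye = spectral_angle(all_pixels, reference) < cutoff
```

Because the angle depends only on spectral shape, the first toy dye pixel (twice the reference intensity) yields an angle of essentially zero. Since only two values enter the median, the threshold is simply their midpoint.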

Raman Spectra and Image Processing
Initially, the Raman spectra of both adulterants were visually evaluated to find informative wavenumber ranges (360-1800 cm⁻¹ in both cases) and thus reduce the hypercube dimension in the spectral domain. Figure 2 shows the fluorescence-corrected Raman signatures of paprika powder and both adulterants in the selected spectral range, revealing that the adulterant spectra featured numerous peaks, most of which overlapped with signals of the other adulterant or paprika powder. Fluorescence correction and de-noising of Raman signals were performed using adaptive iteratively reweighted penalized least squares (airPLS) and median filter methods, with the band images for Sudan-I and Congo Red shown in Figure 3. Based on the original images, one can easily observe that the paprika powder background was very intense, with the occasionally occurring spikes (cosmic ray effect) suppressing relevant information and making the band images representing two different adulterants look identical. However, hypercube preprocessing nullified the background effect and high-frequency noise, enhancing the Raman images, with visual inspection of the corrected images facilitating the visualization of adulterant particle distribution.

Univariate and Bivariate Analyses
As a common practice, univariate imaging can be performed by selecting a single-band image of the target chemical recorded using its most intense Raman peak. The most intense Raman peak for Sudan-I was observed at 1594 cm⁻¹ but exhibited an overlap with a peak of Congo Red (1588 cm⁻¹). On the other hand, the most intense peak of Congo Red, observed at 1152 cm⁻¹, overlapped with that of paprika powder (1156 cm⁻¹). Therefore, univariate imaging was herein performed by selecting the band images for Sudan-I and Congo Red at 1227 and 1351 cm⁻¹, respectively, since these bands were the second most intense and exhibited the least interference with the Raman peaks of other components. A chemical map of each adulterant was created by plotting these unique wavebands as a function of spatial position and intensity (Figure 3). The inconsistent background in the original images was corrected to a more consistent one, allowing pixels representing adulterant particles to be clearly seen after preprocessing.
Since hypercube (image) correction is usually performed in the spectral domain (unfolded 2D data), the existence of noise in the spatial domain is band-independent. Therefore, univariate (single-band) imaging may result in high-intensity noise pixels that can be classified as adulterant pixels after intensity-based image thresholding. A simple way of mitigating these effects is the application of the bivariate method, which involves the use of two data points. Importantly, noise is randomly distributed and is not related to the band position, whereas adulterant pixels in two band images of the same adulterant have the same spatial position. Therefore, the multiplication of two bands representing the same adulterant enhances only the pixels pertaining to the adulterant. Hence, we herein used the multiplication of two bands (i.e., 1227 and 1493 cm⁻¹ for Sudan-I and 1351 and 1451 cm⁻¹ for Congo Red) to enhance and segregate the adulterant particles from the paprika powder background and thus develop a quantitative model. The selected bands at 1227 and 1493 cm⁻¹ for Sudan-I are related to the δ(CH) and ν(CC) vibration modes, respectively [22], and the Raman bands at 1351 and 1451 cm⁻¹ selected for Congo Red represent CR bands related to azo-mode frequencies [23,24]. The result of the bivariate imaging (Figure 4) confirmed the enhancement of adulterant pixels and showed that the background noise could be significantly reduced, as is obvious from the comparison of the multiplication image with the single-band image for Congo Red.

Multivariate Analysis
Spectral angle mapper (SAM) is a widely used method for pixel-by-pixel hyperspectral image screening, requiring the assignment of a reference spectrum (endmember) to calculate the angle between the reference spectrum and the spectrum of each pixel in the target image. Herein, individual SAM models were developed for each adulterant to identify adulterant particles in spiked samples, with the obtained images shown in Figure 5. The intensity of each pixel in the obtained rule images represents the SAM angle between the endmember and the spectrum extracted from a given hyperspectral pixel [21], with darker pixels indicating greater similarity between these spectra. Paprika powder pixels were less dark (larger angles) than those of the adulterants owing to the higher dissimilarity between paprika powder and the reference spectrum. Thus, the dark pixels in Figure 5 represent adulterant particles, with the obtained results confirming the high potential of SAM for identifying and locating the spatial positions of individual adulterant particles based on the Raman characteristics of these adulterants. Since the rule images showed a distinct difference between adulterant and food powder pixels, the obtained images could be subjected to thresholding to develop a quantitative analysis model.

Multivariate Analysis
Spectral angle mapper (SAM) is a widely used method for pixel-by-pixel hyperspectral image screening, requiring the assignment of a reference spectrum (endmember) to calculate the angle between reference spectra and the spectra of each pixel in the target image. Herein, individual SAM models were developed for each adulterant to identify adulterant particles in spiked samples, with the obtained images shown in Figure 5. The intensities of each pixel in the obtained rule images represent the SAM angles between the endmember and the spectrum extracted from a given hyperspectral pixel [21], with pixel darkness being proportionate to the similarity between these spectra. Paprika powder showed less dark pixels (larger angles) than those of the adulterants owing to the higher dissimilarity of the former and the reference spectrum. Thus, the dark pixels in Figure 5 represent adulterant particles, with the obtained results confirming the high potential of SAM for identifying and locating the spatial positions of individual adulterant particles based on the Raman characteristics of these adulterants. Since the rule images showed a distinct difference between adulterant and food powder pixels, the obtained images could be subjected to thresholding to develop a quantitative analysis model. two bands representing the same adulterant obviously enhances only the pixels pertaining to the adulterant. Hence, we herein used the multiplication of two bands (i.e., 1227 and 1493 cm −1 for Sudan-I and 1351 and 1451 cm −1 for Congo Red) to enhance and segregate the adulterant particles from the paprika powder background and thus develop a quantitative model. The selected bands 1227 and 1493 cm −1 for Sudan-I are related to the δ(CH) and ν(CC) vibration modes, respectively [22] and the Raman bands 1351 and 1451 cm −1 selected for Congo Red represents CR bands and are related to azomode frequencies [23,24]. 
The results of bivariate imaging (Figure 4) confirmed the enhancement of adulterant pixels and showed that background noise was significantly reduced, as is evident from a comparison of the multiplication image with the single-band image for Congo Red.
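The band multiplication described above can be illustrated with a minimal NumPy sketch. It is not the processing code used in this study: the hypercube layout (rows, columns, bands), the wavenumber axis, the helper names `band_image` and `bivariate_image`, and all numerical values are assumptions chosen for illustration only.

```python
import numpy as np

def band_image(cube, wavenumbers, target):
    """Return the band image closest to the target wavenumber (hypothetical helper)."""
    idx = int(np.argmin(np.abs(wavenumbers - target)))
    return cube[:, :, idx]

def bivariate_image(cube, wavenumbers, band_a, band_b):
    """Multiply two band images pixel by pixel (hypothetical helper)."""
    return band_image(cube, wavenumbers, band_a) * band_image(cube, wavenumbers, band_b)

# Synthetic hypercube, assumed layout (rows, cols, bands); values are illustrative.
wavenumbers = np.arange(1000.0, 1700.0, 1.0)      # index 227 -> 1227, index 493 -> 1493
cube = np.full((4, 4, wavenumbers.size), 0.1)     # weak, flat background
cube[1, 2, [227, 493]] = 1.0                      # adulterant-like pixel: both bands intense
cube[3, 0, 227] = 1.0                             # noise-like pixel: one band only

single = band_image(cube, wavenumbers, 1227.0)
product = bivariate_image(cube, wavenumbers, 1227.0, 1493.0)
```

In this synthetic case, the noise-like pixel is as bright as the adulterant-like pixel in the single-band image, whereas in the product image it drops to one tenth of the adulterant intensity, because only one of its two factors is large.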

Multivariate Analysis
Spectral angle mapper (SAM) is a widely used method for pixel-by-pixel hyperspectral image screening. It requires the assignment of a reference spectrum (endmember) and calculates the angle between this reference spectrum and the spectrum of each pixel in the target image. Herein, individual SAM models were developed for each adulterant to identify adulterant particles in spiked samples, with the obtained images shown in Figure 5. The intensity of each pixel in the obtained rule images represents the SAM angle between the endmember and the spectrum extracted from the corresponding hyperspectral pixel [21], with pixel darkness increasing with the similarity between these spectra. Paprika powder pixels were less dark (larger angles) than adulterant pixels owing to the greater dissimilarity between the paprika spectra and the reference spectrum. Thus, the dark pixels in Figure 5 represent adulterant particles, and the obtained results confirm the high potential of SAM for identifying and locating individual adulterant particles based on their Raman characteristics. Since the rule images showed a distinct difference between adulterant and food powder pixels, they could be subjected to thresholding to develop a quantitative analysis model.
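For reference, the spectral angle between a pixel spectrum t and an endmember r is arccos(t·r / (‖t‖‖r‖)). The following sketch computes a rule image of such angles with NumPy. The function name `sam_rule_image`, the (rows, columns, bands) layout, and the toy spectra are assumptions made for illustration, and the sketch presumes that no spectrum is identically zero.

```python
import numpy as np

def sam_rule_image(cube, reference):
    """Spectral angle (radians) between every pixel spectrum and the reference.

    cube: array of shape (rows, cols, bands); reference: array of shape (bands,).
    Assumes no pixel spectrum (and not the reference) is all zeros.
    """
    dot = cube @ reference                                      # shape (rows, cols)
    norms = np.linalg.norm(cube, axis=2) * np.linalg.norm(reference)
    cos = np.clip(dot / norms, -1.0, 1.0)                       # guard against rounding
    return np.arccos(cos)

# Toy data (illustrative values only)
reference = np.array([1.0, 2.0, 0.0])
cube = np.zeros((1, 2, 3))
cube[0, 0] = 3.0 * reference          # same pattern, three times the intensity
cube[0, 1] = [0.0, 0.0, 5.0]          # orthogonal pattern

angles = sam_rule_image(cube, reference)
```

The first pixel yields an angle of essentially zero despite its higher intensity, which reflects the fact that the angle depends on the spectral pattern rather than on the overall intensity; the second pixel yields π/2.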

Quantitative Analysis
Images obtained using each method were subjected to thresholding to count the adulterant pixels at each concentration, and thus to develop a quantitative model and compare the performance of each technique. The preprocessed band images (for univariate analysis) and band multiplication images (for bivariate analysis) were thresholded by assigning all pixels with intensities below the threshold value to the background and those with intensities above the threshold value to adulterant particles. The optimum threshold values were determined separately for each adulterant by examining the Raman spectra of pixels classified as either adulterant or paprika powder. For SAM, binary images were generated from the rule images (Figure 5) by applying a threshold value calculated as described in the Materials and Methods section. The content of each adulterant was then determined as the percentage of adulterant pixels in the binary images.
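The thresholding and pixel-counting step can be sketched as follows. This is an illustrative sketch only: the function name `adulterant_percentage`, the `adulterant_is_high` flag, and the threshold values are hypothetical. For SAM rule images, the sketch assumes that adulterant pixels are those with angles below the threshold, since smaller angles indicate greater similarity to the endmember.

```python
import numpy as np

def adulterant_percentage(image, threshold, adulterant_is_high=True):
    """Threshold an image and return (binary mask, percentage of adulterant pixels).

    adulterant_is_high=True  : intensity images (band or band-product images).
    adulterant_is_high=False : SAM rule images, where small angles mean similarity
                               (an assumption of this sketch).
    """
    mask = image > threshold if adulterant_is_high else image < threshold
    return mask, float(100.0 * mask.sum() / mask.size)

# Illustrative values only
intensity = np.array([[0.1, 0.9],
                      [0.2, 0.8]])
mask_int, pct_int = adulterant_percentage(intensity, threshold=0.5)

angles = np.array([[0.05, 1.2],
                   [1.10, 1.3]])
mask_sam, pct_sam = adulterant_percentage(angles, threshold=0.3, adulterant_is_high=False)
```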
To visualize the spatial distribution of each adulterant in paprika powder, images obtained by univariate analysis were combined as shown in Figure 6, with a similar procedure performed for bivariate and multivariate analyses. The number of color-coded pixels was low in images of the 0.1 wt % samples and increased with increasing adulterant content, confirming that the number of detected adulterant pixels was linearly correlated with adulterant content (R > 0.95 for each adulterant and each analysis method).
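A linear correlation of this kind can be checked with a few lines of NumPy. The numbers below are arbitrary placeholders, not values from this study (they are not taken from Table 1 or from any figure); the variable names are likewise hypothetical.

```python
import numpy as np

# Placeholder data in arbitrary units (NOT results from this study)
added = np.array([1.0, 2.0, 3.0, 4.0])       # nominal adulterant content
detected = np.array([1.1, 1.9, 3.2, 3.8])    # pixel-based detected percentage

# Pearson correlation coefficient R
r = float(np.corrcoef(added, detected)[0, 1])

# Least-squares line: detected = slope * added + intercept
slope, intercept = np.polyfit(added, detected, deg=1)
```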

Comparison of Analysis Methods
The three analysis methods were compared based on their ease of implementation and performance in determining adulterant concentration. Table 1 lists the pixel-based detected concentrations of adulterants in each sample. A linear relationship between the pixel-based detected percentage and the actual concentration of adulterants in the mixture was established for quantitative analysis. The adulterant contents determined using univariate and bivariate imaging were very similar (Table 1), with slightly better performance observed for the latter method.
This behavior was ascribed to the suppression of noise by image multiplication, whereas the univariate method produced some false positive pixels for Sudan-I (inside the black box in Figure 6). The pixels marked in Figure 6 corresponded to high-frequency noise in the spatial domain, which was classified as Sudan dye during image thresholding. When the bivariate method was used and the band images were multiplied, this noise was suppressed because the adulterant pixel intensity was enhanced. Compared with the univariate and bivariate methods, SAM underestimated the adulterant content of low-concentration samples (particularly in the case of Sudan-I) and overestimated that of highly concentrated samples (e.g., mixture 4). This behavior was attributed to the fact that the minor peaks of adulterants were not clearly distinguished from those of the paprika powder background in low-concentration samples, while being fairly intense in concentrated samples. In other words, the SAM algorithm calculates the similarity between the spectrum of the reference material and that of each investigated pixel. If a pixel spectrum is strongly influenced by that of the background material (paprika powder), the calculated angle lies between those of paprika powder and the adulterant of interest. Such pixels are considered to be sub-pixels and are missed during image thresholding. To correct for this sub-pixel effect, threshold values slightly higher and lower than the calculated median values were also tested. However, at the higher threshold value, some background pixels were identified as adulterants (over-classification), whereas at the lower threshold value, some adulterant pixels were missed (under-classification).
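The sub-pixel effect can be illustrated numerically. The sketch below assumes a simple linear mixing of an adulterant spectrum and a background spectrum within one pixel, which is a modeling assumption of this example rather than a statement from the study; the four-band spectra and the name `spectral_angle` are synthetic and hypothetical.

```python
import numpy as np

def spectral_angle(a, b):
    """Angle (radians) between two spectra (hypothetical helper)."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Synthetic four-band spectra (illustrative values only)
adulterant = np.array([0.1, 1.0, 0.1, 0.8])   # reference (endmember)
background = np.array([0.6, 0.2, 0.7, 0.3])   # paprika-like background

# Assumed linear mixing: fraction f of adulterant signal in the pixel
fractions = [1.0, 0.5, 0.2, 0.0]
angles = [spectral_angle(f * adulterant + (1.0 - f) * background, adulterant)
          for f in fractions]
```

Under this assumption, the angle grows steadily from approximately zero (pure adulterant) to the background angle as the adulterant fraction decreases, so any fixed angle threshold trades missed mixed pixels against misclassified background pixels.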
In another approach, to check the similarity between the adulterant particles detected using the three analysis methods, the binary images of same-concentration samples generated using these methods were combined for pixel-to-pixel comparison, as shown in Figure 7. At each concentration level, the detected adulterant pixels were color coded: green pixels represent the univariate method, blue pixels the bivariate method, and red pixels the multivariate (SAM) analysis. Pixels detected by two methods (any pair of the three analysis methods) are shown in orange, pink, and baby blue, and pixels detected as adulterants by all three methods are shown in black. The majority of pixels in these images were black, implying that all three methods assigned the same pixels as adulterants. However, some pixels corresponding to relatively small spots were identified as adulterants by the univariate and bivariate methods (brown color) but not by SAM, probably because the latter method treated them as sub-pixels for the reason mentioned above. Visual evaluation of the spectra of these pixels confirmed their assignment as adulterants.
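One simple way to perform such a pixel-to-pixel comparison is to encode the three binary images as bit flags in a single integer image. The sketch below is an assumption about how this could be done, not the procedure used to produce Figure 7; the function name `combine_masks`, the bit assignment, and the toy masks are hypothetical.

```python
import numpy as np

def combine_masks(univariate, bivariate, sam):
    """Encode three boolean masks as one integer image.

    Bit 0 (value 1) = univariate, bit 1 (value 2) = bivariate, bit 2 (value 4) = SAM,
    so 7 marks pixels detected by all three methods and 3 marks pixels detected
    by the univariate and bivariate methods but not by SAM.
    """
    return (univariate.astype(np.uint8)
            + 2 * bivariate.astype(np.uint8)
            + 4 * sam.astype(np.uint8))

# Toy masks (illustrative values only)
uni = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
bi = np.array([[1, 1, 0], [0, 0, 0]], dtype=bool)
sam = np.array([[1, 0, 0], [0, 0, 1]], dtype=bool)

codes = combine_masks(uni, bi, sam)
counts = np.bincount(codes.ravel(), minlength=8)   # pixels per combination code
```

Each of the eight possible codes can then be mapped to a display color, and `counts` gives the number of pixels in each agreement class.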
Therefore, the obtained results revealed that although univariate imaging is the simplest and most convenient method, it is not applicable when the intense peaks of target adulterants overlap with those of other sample constituents. Moreover, the presence of high-frequency noise, which is common for agricultural and biological materials, can produce false positive pixels. This problem can be resolved by bivariate imaging, which enhances adulterant particle intensities and thus suppresses noise. Because it uses only two band images, this approach allows the straightforward and rapid analysis of Raman imaging data and can thus be considered an effective analysis method for the online application of Raman imaging in the agro-food sector. On the other hand, SAM imaging, which considers the complete dataset, requires more computation time and did not prove effective for low-concentration samples. It is worth mentioning that binary SAM images featured fewer mixed pixels (shown in black in Figure 6) than those obtained by the other two techniques. Moreover, SAM analysis offers increased reproducibility, as it is the least sensitive to the slightly time-dependent spectral intensity while being highly sensitive to the spectral pattern. In summary, the bivariate method (band multiplication) is the most suitable for the quantitative analysis of the two adulterants (Sudan-I and Congo Red dyes) in paprika powder. The pixel-based detected concentration of adulterants is largely consistent with the added concentration, unlike SAM analysis, which shows false negative pixels for low-concentration samples and overestimation for high-concentration samples.
Moreover, bivariate analysis uses a very small portion of the dataset (only two band images), thus reducing the computation time and facilitating the real-time visualization of Raman chemical images of adulterants, which is not possible using SAM owing to the aforementioned complications in data analysis.

Conclusions
Herein, we evaluated three different methods of Raman imaging data analysis, showing that all of them were effective for adulterant screening in food powder and for the subsequent development of a quantitative model. Notably, although the univariate method is the simplest and easiest to implement, it suffers from an increased risk of generating false positive pixels and does not perform well when the peaks of the target adulterant overlap with those of another sample constituent. On the other hand, SAM considers the complete dataset but did not perform very well in this study, possibly for the reasons set out above. Conversely, bivariate analysis relies on the multiplication of two band intensities, enhancing adulterant signals and suppressing background noise, and can thus be considered a simple but effective Raman imaging method, with the produced Raman maps providing valuable information on the spatial distribution of adulterant particles and thus allowing the further development of accurate quantitative models.
However, since the three data analysis methods were compared using a single dataset (adulterated paprika powder), the obtained results are strictly valid only for that dataset. Nevertheless, the findings of this study show that the univariate method is sensitive to background noise, which can appear in Raman band images of any kind of powdered food sample, and that the SAM-based method considers the spectral pattern regardless of the particular adulterant used. Therefore, we believe that the findings of this study can be used to select appropriate methods for Raman imaging data analysis of any kind of (adulterated) powdered food sample.