The Changes in Bell Pepper Flesh as a Result of Lacto-Fermentation Evaluated Using Image Features and Machine Learning

Food processing allows for maintaining the quality of perishable products and extending their shelf life. Nondestructive procedures combining image analysis and machine learning can be used to control the quality of processed foods. This study was aimed at developing an innovative approach to distinguishing fresh and lacto-fermented red bell pepper samples involving selected image textures and machine learning algorithms. Before processing, the pieces of fresh pepper and samples subjected to spontaneous lacto-fermentation were imaged using a digital camera. The texture parameters were extracted from images converted to different color channels L, a, b, R, G, B, X, Y, and Z. The textures after selection were used to build models for the classification of fresh and lacto-fermented samples using algorithms from the groups of Lazy, Functions, Trees, Bayes, Meta, and Rules. The highest average accuracy of classification reached 99% for the models developed based on sets of selected textures for color space Lab using the IBk (instance-based K-nearest learner) algorithm from the group of Lazy, color space RGB using SMO (sequential minimal optimization) from Functions, and color space XYZ and color channel X using IBk (Lazy) and SMO (Functions). The results confirmed the differences in image features of fresh and lacto-fermented red bell pepper and revealed the effectiveness of models built based on textures using machine learning algorithms for the evaluation of the changes in the pepper flesh structure caused by processing.


Introduction
Bell pepper (Capsicum annuum L.), belonging to Solanaceae, is a widely cultivated fruit, used as a vegetable, spice, or condiment. Consuming pepper can provide health benefits due to the presence of phytochemicals, including phenolic compounds, capsaicinoids, vitamins C and E, and carotenoids [1]. Processing of pepper fruit can prolong the storage time and provide value-added food products [1]. Strong antioxidant capacity, high content of bioactive compounds, distinctive colors, flavor, and nutritional value result in the popularity of consuming bell pepper. Bell pepper color may be related to significant differences in taste, antioxidant capacity, bioactive compounds, nutrient content, and cost. Green bell pepper contains chlorophylls and distinctive carotenoids (lutein, neoxanthin, and violaxanthin) [2]. Red pepper is characterized by the presence of capsorubin and capsanthin [2]. In the case of yellow pepper, violaxanthin, β-carotene, lutein, zeaxanthin, and antheraxanthin are the most common. Higher concentrations of polyphenols, including flavonoids and quercetin, have been determined in red and yellow peppers than in green fruit [2]. Due to the presence mainly of carotenoids, flavonoids, and vitamins, bell pepper is an important ingredient against aging and preventing chronic [...] algorithms were successfully used in previous studies for evaluation of the changes, e.g., in carrot [22], cucumber [23], and beetroot [24,25], caused by lacto-fermentation.
This study offers an innovative and comprehensive approach to assessing the quality of bell pepper fruit. In this context, the supplied red bell pepper sample pieces are imaged before and after lacto-fermentation. Each of these images is converted to L, a, b, R, G, B, X, Y, and Z color channels. Using the brightness threshold, each sample is segmented from the background, and regions of interest (ROI) are generated. Then, various texture features are extracted from ROI images with different color channels. Texture features extracted from different color channels show the effect of different textural information on the discrimination process. Feature selection algorithms are used for a large number of texture features obtained at this stage. Finally, these selected features are analyzed using various machine learning algorithms, resulting in highly accurate discrimination of fresh and lacto-fermented red bell pepper samples. The results show that the proposed method is capable of distinguishing with an accuracy of up to 99%.
Despite the published results, mainly for other species of fruit and vegetables, there are no literature data on the use of texture parameters extracted from various color channels of images of cut red bell pepper to build classification models for monitoring the effect of lacto-fermentation on changes in the structure of pepper flesh. Therefore, the objective of this study was to propose an innovative approach to distinguishing fresh and lacto-fermented red bell pepper samples involving selected texture parameters of images and various machine learning algorithms. The application of textures from different color channels L, a, b, R, G, B, X, Y, and Z and algorithms belonging to different groups to evaluate the changes in red bell pepper flesh occurring as a result of spontaneous lacto-fermentation is the main novelty of this study. The innovative nature of the present study involves the acquisition of data on more than 1600 texture parameters of bell pepper flesh in the fresh and lacto-fermented forms. Various methods of selecting textures with the highest discriminant power, such as genetic search and best first with the correlation-based feature selection (CFS), as well as the ranker in conjunction with the OneR attribute evaluator, were used to choose the most effective one. A wide range of algorithms from the groups of Lazy, Functions, Trees, Bayes, Meta, and Rules was tested, which was not found in the literature data for the classification of fresh and processed bell pepper. The use of various algorithms in the discrimination process is intended to show the validity, applicability, and robustness of the proposed method rather than to compare machine learning algorithms. This paper is organized as follows: after the introduction, Section 2 deals with the acquisition, imaging, processing, and statistical analysis of bell pepper samples.
Section 3 includes the discrimination performances obtained as a result of evaluating the texture features extracted from different color channels with different machine learning algorithms. Lastly, Section 4 evaluates the proposed work and provides suggestions for future work.

Materials
The red bell pepper samples (Figure 1) were bought at a supermarket. A total of 20 mature undamaged fruits were selected for this study. Fruits were washed and cleaned. Then, five pieces with the dimensions of 1 cm × 1 cm were cut from each pepper fruit using a sharp stainless-steel knife. The extracted fresh pepper pieces were subjected to imaging. The same pieces were also intended for spontaneous lacto-fermentation and then imaging as lacto-fermented forms.
Spontaneous lacto-fermentation was carried out using garlic, horseradish, dill, and potable water with table salt at a final concentration of 3% sodium chloride in the brine. The previously prepared red bell pepper pieces were put into glass jars with the other ingredients. Samples were stored for 3 days at a temperature of about 20 °C and then for 6 months at about 10-12 °C. After storage, the lacto-fermented pepper samples were rinsed under potable water and dried with a paper towel. The samples prepared in this way were subjected to imaging using a digital camera.

Image Acquisition and Processing
The same imaging procedure was applied for the pieces of fresh pepper before processing and lacto-fermented pepper. The samples were imaged on the inside of the flesh using a digital camera placed on a tripod in a box with black internal walls. As a light source, LED (light-emitting diode) illumination with stable parameters was used. Color calibration of the digital camera was carried out. The acquired images contained red bell pepper pieces on a black background. There were 20 pepper pieces in one image. In total, images of 100 pieces of fresh pepper and 100 pieces of lacto-fermented pepper were obtained. The image processing using the MaZda application (Łódź University of Technology, Institute of Electronics, Łódź, Poland) [21,26,27] allowed computing texture parameters from different color channels L, a, b, R, G, B, X, Y, and Z. Firstly, the fresh and lacto-fermented bell pepper images were converted to color channels. In the case of lacto-fermented samples, changes in the structure of the flesh compared to the fresh samples are visible (Figure 2).
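The channel conversion described above was performed in MaZda. Purely as an illustration, the following Python sketch shows how an RGB image could be split into the nine channels R, G, B, X, Y, Z, L, a, and b. It assumes sRGB input scaled to [0, 1] and a D65 white point, neither of which is stated in the study, and the function names are hypothetical.

```python
import numpy as np

# Linear sRGB -> CIE XYZ matrix (D65), as commonly tabulated for sRGB.
# Assumption: the camera images are treated as sRGB; the study does not say so.
M_RGB2XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
# XYZ of sRGB white (R = G = B = 1), used as the Lab reference white.
WHITE_D65 = M_RGB2XYZ.sum(axis=1)


def srgb_to_xyz(rgb):
    """Convert sRGB values in [0, 1] with shape (..., 3) to CIE XYZ."""
    rgb = np.asarray(rgb, dtype=float)
    # Undo the sRGB gamma encoding before applying the linear transform.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    return linear @ M_RGB2XYZ.T


def xyz_to_lab(xyz, white=WHITE_D65):
    """Convert CIE XYZ with shape (..., 3) to CIE Lab."""
    t = np.asarray(xyz, dtype=float) / white
    delta = 6.0 / 29.0
    f = np.where(t > delta ** 3, np.cbrt(t), t / (3 * delta ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0          # lightness, black (0) to white (100)
    a = 500.0 * (f[..., 0] - f[..., 1])   # green (negative) to red (positive)
    b = 200.0 * (f[..., 1] - f[..., 2])   # blue (negative) to yellow (positive)
    return np.stack([L, a, b], axis=-1)


def split_channels(rgb_image):
    """Return a dict of nine single-channel images keyed by channel name."""
    rgb = np.asarray(rgb_image, dtype=float)
    xyz = srgb_to_xyz(rgb)
    lab = xyz_to_lab(xyz)
    stacked = np.concatenate([rgb, xyz, lab], axis=-1)
    names = ["R", "G", "B", "X", "Y", "Z", "L", "a", "b"]
    return {name: stacked[..., i] for i, name in enumerate(names)}
```

Each returned array has the same height and width as the input image, so texture parameters can be computed channel by channel.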
The regions of interest (ROIs) were overlaid. An ROI was considered as a set of pixels separated from the background. The segmentation of the image into lighter pepper pieces and the black background was performed using a brightness threshold, which was determined manually. Each ROI was one whole piece of pepper. Thus, 200 ROIs, including 100 ROIs of fresh samples and 100 ROIs of lacto-fermented samples, were determined. For each ROI, 1629 image textures were computed, including 181 textures for each color channel. The texture parameters were computed on the basis of the co-occurrence matrix (132 textures including 11 features for four directions and three between-pixel distances), run-length matrix (20 textures including five textures for four directions), Haar wavelet transform (10 textures), histogram (nine textures), autoregressive model (five textures), and gradient map (five textures). Among the color channels of images selected for the texture extraction, color channels R (red), G (green), and B (blue) belonged to the RGB color space, channels L (lightness from black to white), a (green for negative and red for positive values), and b (blue for negative and yellow for positive values) belonged to the Lab color space, and channels X (a component of color information), Y (lightness), and Z (a component of color information) belonged to the XYZ color space [28]. The computed image textures were considered as a function of the spatial variation of the pixel brightness intensity and provided information about the structure of the samples. Thus, the quantitative analysis of textures provided important insights into sample quality. The texture parameters after selection were used to build the discriminative models for distinguishing fresh and lacto-fermented red bell pepper samples. A flowchart including steps of the applied procedure is presented in Figure 3.
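To make the segmentation and co-occurrence steps more concrete, the sketch below thresholds a single-channel image into an ROI mask and computes two co-occurrence features inside it. It is not the MaZda implementation: the function names, the 16-level quantisation, the choice of distances 1, 2, and 3, and the two example features are assumptions made for illustration only.

```python
import numpy as np

# Texture counts reported in the text: co-occurrence, run-length, wavelet,
# histogram, autoregressive model, gradient map.
PER_CHANNEL = 11 * 4 * 3 + 5 * 4 + 10 + 9 + 5 + 5   # 181 textures per channel
TOTAL = PER_CHANNEL * 9                              # 1629 textures per ROI

# Four directions at three between-pixel distances (assumed to be 1, 2, 3).
OFFSETS = [(d * ux, d * uy) for d in (1, 2, 3)
           for ux, uy in ((1, 0), (0, 1), (1, 1), (1, -1))]


def segment_roi(gray, threshold):
    """Boolean mask of pixels brighter than a manually chosen threshold."""
    return np.asarray(gray) > threshold


def glcm(gray, mask, dx, dy, levels=16):
    """Symmetric, normalised co-occurrence matrix for the offset (dx, dy).

    Only pixel pairs lying entirely inside the ROI mask are counted.
    Assumes an 8-bit intensity range (0..255) quantised to `levels` bins.
    """
    gray = np.asarray(gray, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    q = np.clip((gray / 256.0 * levels).astype(int), 0, levels - 1)
    h, w = q.shape
    ref_rows = slice(max(0, -dy), min(h, h - dy))
    ref_cols = slice(max(0, -dx), min(w, w - dx))
    nb_rows = slice(max(0, dy), min(h, h + dy))
    nb_cols = slice(max(0, dx), min(w, w + dx))
    ref, nb = q[ref_rows, ref_cols], q[nb_rows, nb_cols]
    valid = mask[ref_rows, ref_cols] & mask[nb_rows, nb_cols]
    counts = np.zeros((levels, levels))
    np.add.at(counts, (ref[valid], nb[valid]), 1)
    counts = counts + counts.T  # count each pair in both directions
    total = counts.sum()
    return counts / total if total > 0 else counts


def contrast(p):
    """Co-occurrence contrast: sum of (i - j)^2 * p(i, j)."""
    i, j = np.indices(p.shape)
    return float(np.sum((i - j) ** 2 * p))


def angular_second_moment(p):
    """Co-occurrence energy: sum of p(i, j)^2."""
    return float(np.sum(p ** 2))


# Usage on a synthetic image: a bright square on a black background.
image = np.zeros((12, 12))
image[2:10, 2:10] = 180.0
roi = segment_roi(image, threshold=50)
features = {(dx, dy): contrast(glcm(image, roi, dx, dy)) for dx, dy in OFFSETS}
```

A perfectly uniform ROI, as in the usage example, gives a contrast of zero for every offset; structural changes in the flesh would show up as changes in such values.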

Statistical Analysis
Statistical analysis involved the development of innovative models for distinguishing fresh and lacto-fermented red bell pepper samples on the basis of image features. The discriminant analysis was performed using WEKA software (Machine Learning Group, University of Waikato) [29][30][31] and included several steps. In the first step, the texture selection was performed. This procedure was applied for sets of textures computed for color spaces Lab, RGB, and XYZ including combined textures for three channels in the case of each color space, and for individual color channels L, a, b, R, G, B, X, Y, and Z. In the case of each color space, the most satisfactory results for one channel were chosen to be presented in this paper. Among the search methods, genetic search and best first with the correlation-based feature selection (CFS), and the ranker in conjunction with OneR attribute evaluator were applied. Various algorithms from different groups were tested such as Lazy (LWL-locally weighted learning, KStar, IBk-instance-based K-nearest learner), Functions (logistic, LDA-linear discriminant analysis, FLDA-Fisher linear discriminant analysis, QDA-quadratic discriminant analysis, SMO-sequential minimal optimization), Trees (LMT-logistic model tree, random forest, J48), Bayes (Bayes net, naïve Bayes), Meta (multi class classifier, filtered classifier, random committee, logit boost), and Rules (JRip-Java repeated incremental pruning, PART). A test mode of 10-fold cross-validation was used to perform the analysis. The dataset including a total of 200 cases was randomly divided into 10 parts. Each of the ten parts was considered in turn and treated as the test set, whereas the remaining nine parts were regarded as the training sets. The learning was performed 10 times using different training sets. The result was determined as the average of 10 estimates. Ten folds are generally sufficient to obtain the best estimate [31]. 
The criterion for the selection of machine learning algorithms and evaluation of analysis was the highest average accuracy of classification. In addition to average accuracy and confusion matrices including accuracies for fresh and lacto-fermented samples, other performance metrics, such as precision, F-measure, and MCC (Matthews correlation coefficient), were also determined [32,33].
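The 10-fold cross-validation procedure described above can be sketched as follows. This is an illustrative NumPy version, not the WEKA implementation used in the study: a plain one-nearest-neighbour rule stands in for IBk, the helper names are hypothetical, and the data are synthetic.

```python
import numpy as np


def one_nn_predict(train_X, train_y, test_X):
    """Label each test row with the class of its nearest training row."""
    d = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    return train_y[d.argmin(axis=1)]


def cross_validated_accuracy(X, y, n_folds=10, seed=0):
    """Average accuracy over n_folds random, non-overlapping test sets."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for k in range(n_folds):
        test_idx = folds[k]
        # The remaining nine parts form the training set for this round.
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        pred = one_nn_predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append(np.mean(pred == y[test_idx]))
    return float(np.mean(scores))


# Synthetic stand-in for 200 cases (100 per class) with 5 selected textures.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (100, 5)), rng.normal(10.0, 0.1, (100, 5))])
y = np.repeat([0, 1], 100)
accuracy = cross_validated_accuracy(X, y)
```

With 200 cases and 10 folds, each test set holds 20 cases, so every case is used for testing exactly once.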

Results
The innovative models based on selected image textures for distinguishing fresh and lacto-fermented red bell pepper samples were developed using machine learning algorithms. The models were built separately for color spaces and color channels. Among the tested search methods, best first with CFS proved to be the most satisfactory in terms of sets of selected textures providing the highest accuracies. The textures selected using best first with CFS are presented in Table 1. The highest accuracies were obtained with models developed using the following machine learning algorithms: IBk (Lazy), SMO (Functions), random forest (Trees), naïve Bayes (Bayes), filtered classifier (Meta), and JRip (Rules).
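The idea behind correlation-based feature selection is to prefer subsets whose features correlate strongly with the class but weakly with each other. The sketch below scores a subset with the usual CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), and grows the subset greedily. It is a simplified illustration under stated assumptions: Pearson correlation is used as the correlation measure, a greedy forward search replaces the best-first search (which can backtrack), features are assumed to be non-constant, and the function names are hypothetical rather than WEKA's.

```python
import numpy as np


def cfs_merit(X, y, subset):
    """CFS merit of a feature subset, using absolute Pearson correlations."""
    subset = list(subset)
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean feature-class correlation.
    r_cf = np.mean([abs(np.corrcoef(X[:, f], y)[0, 1]) for f in subset])
    # Mean feature-feature correlation over all pairs in the subset.
    if k == 1:
        r_ff = 0.0
    else:
        r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                        for i, a in enumerate(subset) for b in subset[i + 1:]])
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))


def greedy_forward_selection(X, y):
    """Add the feature that most improves the merit until none helps."""
    selected, best = [], 0.0
    remaining = set(range(X.shape[1]))
    while remaining:
        merit, feat = max((cfs_merit(X, y, selected + [f]), f) for f in remaining)
        if merit <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = merit
    return selected, best
```

Under this merit, a feature that merely duplicates one already selected adds nothing, and an uninformative feature lowers the score, which is why the selected sets stay much smaller than the 181 or 543 textures available.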
In the case of the models built on the basis of selected textures from the color space Lab (Table 2), an average accuracy of discrimination of fresh and lacto-fermented pepper pieces reached 99% for the IBk algorithm from the group of Lazy. Only this algorithm provided 100% accuracy for one of the classes (lacto-fermented). The fresh pepper samples were discriminated with an accuracy of 98%, and the remaining 2% were incorrectly included in the class 'lacto-fermented'. Other discrimination performance metrics were also the most satisfactory for the model built using IBk. The values of precision reached 1.000 for fresh pepper samples and 0.980 for lacto-fermented samples, whereas both classes were characterized by an F-measure parameter equal to 0.990 and MCC equal to 0.980. For the other algorithms, high values of metrics were also obtained. The models built using the SMO algorithm from the group of Functions, random forest from Trees, and naïve Bayes from Bayes correctly discriminated fresh and lacto-fermented pepper samples in 98% of cases, while other metrics were greater than or equal to 0.960, reaching 0.990 for precision for fresh samples and random forest for lacto-fermented samples, as well as naïve Bayes. Slightly lower average accuracies were observed for filtered classifier from the group of Meta (97%) and JRip from the group of Rules (96.5%). In the case of Lab color space, among the individual color channels, models including textures selected from color channel L (Table 3) provided the most satisfactory results. The fresh and lacto-fermented pepper samples were correctly distinguished from each other with an average accuracy of up to 98.5% for the random forest. The fresh pepper samples were classified with an accuracy of 98%, whereas lacto-fermented pepper samples were classified with an accuracy of 99%.
For models developed using other machine learning algorithms, very high average accuracies were also found: 98% for IBk and SMO, 97% for naïve Bayes and filtered classifier, and 96.5% for JRip. The accuracy of 100% was not observed for any of the classes. The highest value of precision of 0.990 was determined for the fresh pepper and random forest. The F-measure of 0.985 and MCC of 0.970 for both classes were also the highest for the random forest algorithm.
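The reported metrics follow directly from the confusion matrix. As a worked check, the sketch below recomputes precision, F-measure, and MCC for the IBk model on the Lab color space, assuming 100 cases per class as stated in the methods (98 fresh samples classified correctly, 2 classified as lacto-fermented, and all 100 lacto-fermented samples classified correctly). The function name is hypothetical.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Metrics for one class treated as positive in a two-class problem."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "mcc": mcc, "accuracy": accuracy}


# Fresh as the positive class: 98 correct, 2 missed, 0 false positives.
fresh = binary_metrics(tp=98, fn=2, fp=0, tn=100)
# Lacto-fermented as the positive class: 100 correct, 2 false positives.
lacto = binary_metrics(tp=100, fn=0, fp=2, tn=98)

print({k: round(v, 3) for k, v in fresh.items()})
# precision 1.0, recall 0.98, f_measure 0.99, mcc 0.98, accuracy 0.99
print({k: round(v, 3) for k, v in lacto.items()})
# precision 0.98, recall 1.0, f_measure 0.99, mcc 0.98, accuracy 0.99
```

The rounded values agree with those reported for IBk: precision of 1.000 and 0.980, and an F-measure of 0.990 and MCC of 0.980 for both classes.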
Furthermore, high average accuracies reaching 99% (SMO algorithm) were determined for models including image textures selected for RGB color space (Table 4). For SMO, all cases belonging to fresh pepper were correctly classified as fresh pepper (100% accuracy), and 98% of cases from the actual class 'lacto-fermented' were correctly included in the predicted class 'lacto-fermented'. The values of precision (0.980 for fresh pepper samples and 1.000 for lacto-fermented pepper samples), F-measure (0.990 for both classes), and MCC (0.980 for both classes) were also very satisfactory. Moreover, a high average accuracy of 98.5% was observed for the IBk and naïve Bayes machine learning algorithms. However, in the case of some models, the average accuracies were lower, such as 93% for the model built using filtered classifier and 93.5% for the model developed using JRip. From color channels belonging to RGB color space, the highest results of discrimination of fresh and lacto-fermented red bell pepper samples were obtained for models built on the basis of selected image textures from channel R (Table 5). An average accuracy of up to 98.5% (SMO) was determined. In the case of the SMO algorithm, which provided the most satisfactory results, the discrimination accuracy for fresh pepper was equal to 99%, and 1% of samples were incorrectly classified as lacto-fermented pepper, whereas lacto-fermented samples were correctly discriminated in 98% of cases, and the remaining 2% were incorrectly included in the predicted class 'fresh pepper'. The values of precision reached 0.980 and 0.990 for fresh and lacto-fermented samples, respectively. The highest F-measure equal to 0.985 and MCC equal to 0.970 were found for both classes. Among the other algorithms, IBk (97.5%), random forest (97%), and naïve Bayes (96%) also allowed for building effective models. In the case of filtered classifier and JRip, lower average accuracies (91.5% and 90.5%, respectively) were determined.
Very high results were also produced by the models built on the basis of selected textures from the XYZ color space (Table 6). The average accuracy reached 99% in the case of the models built using the IBk and SMO machine learning algorithms. A slightly lower average accuracy of 98% was obtained for random forest and naïve Bayes. Furthermore, for filtered classifier and JRip, average accuracies were satisfactory (97% and 96.5%, respectively). In the case of individual classes, 100% accuracy was observed for fresh pepper for the SMO algorithm, whereas lacto-fermented cases were correctly distinguished from fresh samples in 98% of cases. The highest results of precision, F-measure, and MCC were determined for the models built using the IBk and SMO algorithms. In the case of IBk, precision reached 0.990 for both classes, while, for SMO, the value of precision was equal to 1.000 for the lacto-fermented pepper and 0.980 for the fresh pepper. For both IBk and SMO, an F-measure of 0.990 and an MCC of 0.980 were obtained for both classes.
For models developed using selected textures from images converted to the color channel X (Table 7), the same results as for color space XYZ (Table 6) were observed for the IBk and SMO algorithms. The average accuracy also reached 99% (IBk, SMO), and an accuracy of 100% was only found for fresh pepper (SMO). In the case of color channel X (Table 7), for other algorithms, fresh and lacto-fermented pepper samples were correctly distinguished from each other in 98.5% of cases for naïve Bayes, 97% of cases for random forest, 95.5% of cases for filtered classifier, and 93.5% of cases for JRip.
The results presented in Tables 2-7 can be summarized as follows: with image analysis, artificial intelligence, and computer vision algorithms, fresh and lacto-fermented red bell pepper samples were distinguished in an objective, nondestructive, and inexpensive way. In this way, the quality of the crop was determined automatically, quickly, and without bias [34]. Furthermore, the texture features of the different color channels used (L, a, b, R, G, B, X, Y, and Z) strongly represented the changes in lacto-fermented red bell pepper samples. It was confirmed that these texture features can also be effectively distinguished by various machine learning algorithms from different groups (Lazy, Functions, Trees, Bayes, Meta, and Rules). According to the experimental results, the lowest average accuracy was 90.5% (JRip algorithm, color channel R), while the highest was 99% (IBk and SMO algorithms). These results suggest that the proposed method is a promising alternative to conventional methods.
Moreover, the results determined for red bell pepper in the present study confirmed the previous literature data on the effectiveness of models built based on image parameters using machine learning algorithms for the evaluation of the quality of fermented food products such as cucumber, carrot, or beetroot [22][23][24][25]. The undertaken research extends the scope of application of image analysis and artificial intelligence to the evaluation of fruit and vegetables by assessing the quality of fermented products.
Image processing, traditional machine learning, and deep learning are also widely used in other fruit and vegetable research and agricultural activities. Machine learning techniques and algorithms can be used in the pre-harvesting, harvesting, and post-harvesting stages [34]. This state-of-the-art technology can help to solve problems in agriculture and help farmers to reduce losses. In the pre-harvesting stage, machine learning can be applied, for example, to evaluate the quality and germination of seeds, sort seeds, detect disease and weeds, capture soil parameters, support pruning, fertilizer application, and irrigation, and determine genetic and environmental conditions. In the harvesting stage, the activities involving machine learning may be related to crop size, quality, skin color, taste, maturity stage, firmness, market window, object detection, and classification [35]. On the other hand, post-harvesting applications of machine learning may concern factors affecting the shelf life, e.g., temperature, gases used in containers, humidity, usage of chemicals, handling processes to retain quality, and grading [35]. In the case of pepper, machine learning was used for nondestructive sorting based on odor parameters [36]. Models developed using a deep convolutional neural network (DCNN) were applied by Subeesh et al. [14] for weed detection in the polyhouse cultivation of bell peppers. Mohi-Alden et al. [37] used machine vision intelligent modeling for in-line sorting of bell pepper. Red and yellow sweet peppers were classified into immature and mature classes on the basis of color and morphological features using machine learning algorithms [38]. Additionally, the ripeness level was estimated on the basis of color image parameters using machine learning models in the case of grapes [39]. Image analysis, spectroscopy, and electronic nose combined with different statistical models were applied to discriminate the ripening stage of strawberries [40].
Models based on spectral reflectance data built using machine learning algorithms were used to distinguish the Fusarium-infected and healthy pepper samples (leaves) [41]. In view of the promising application of objective and nondestructive procedures involving artificial intelligence and image processing to evaluate fruit and vegetables, further research may include other species and cultivars, as well as novel directions of experiments.

Conclusions
Preservation and processing of bell peppers by lacto-fermentation is a traditional method to extend their shelf life. In this context, the main justification for this study was to evaluate the effect of the lacto-fermentation method on red bell peppers. The approach involving image analysis and machine learning enabled distinguishing fresh and lacto-fermented red bell pepper samples (flesh pieces) on the basis of models including selected textures built using various algorithms. Models developed for sets of textures selected separately for color spaces and color channels provided satisfactory results. The highest results, including the average discrimination accuracy reaching 99%, were obtained for IBk (Lazy), SMO (Functions), random forest (Trees), naïve Bayes (Bayes), filtered classifier (Meta), and JRip (Rules) machine learning algorithms. The high discrimination accuracies, as well as the values of other metrics, such as precision, F-measure, and MCC (Matthews correlation coefficient), revealed differentiation of fresh and lacto-fermented bell pepper samples. Therefore, the changes in the bell pepper flesh structure caused by spontaneous lacto-fermentation were confirmed. Due to the evaluation of the quality of lacto-fermented bell pepper using textures from different color channels L, a, b, R, G, B, X, Y, and Z and various machine learning algorithms belonging to different groups, this study is characterized by novelty. The obtained results are very promising. Therefore, further research on the evaluation of the effect of lacto-fermentation on the flesh of other cultivars of pepper and other species of fruit or vegetables may be performed.
The texture features extracted from the different color channels strongly indicated the effect of lacto-fermentation. However, a single texture feature may not be sufficient to distinguish lacto-fermented red bell pepper samples. The features supplied to machine learning algorithms determine the performance of the system and should, therefore, strongly represent the distinction between the classes being distinguished. Therefore, in future studies, more than one texture feature will be fused, and the effect of lacto-fermentation will be revealed more strongly. In addition, the new trend for agricultural discrimination tasks is to develop deep learning-based methods, in which more powerful features are extracted automatically instead of manually. Deep learning-based techniques can, therefore, provide easier and more accurate discrimination. However, to see the superior performance of deep learning clearly, more samples are required than for traditional machine learning. Therefore, in our next studies, the effect of lacto-fermentation will be analyzed with deep learning-based CNN models that enable the extraction of high-level features. For this, datasets containing more samples will be prepared. Lastly, the proposed study covers only bell pepper and, therefore, may not generalize to other vegetables or fruits subjected to lacto-fermentation. In this sense, it is planned to create a large dataset containing different fruits and vegetables in future studies.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.