Hyperspectral Imaging-Based Multiple Predicting Models for Functional Component Contents in Brassica juncea

Abstract: Partial least squares regression (PLSR) prediction models were developed using hyperspectral imaging for noninvasive detection of the five most representative functional components in Brassica juncea leaves: chlorophyll, carotenoid, phenolic, glucosinolate, and anthocyanin contents. The region of interest for functional component analysis was chosen by polygon selection, and the extracted average spectra were used for model development. For pre-processing, 10 combinations of Savitzky–Golay filter (S. G. filter), standard normal variate (SNV), multiplicative scatter correction (MSC), 1st-order derivative (1st-Der), 2nd-order derivative (2nd-Der), and normalization were applied. The root mean square error of prediction (RMSEP) was used to assess the accuracy of the constructed prediction models. The prediction model for total anthocyanins exhibited the highest prediction level ($R_V^2$ = 0.8273; RMSEP = 2.4277). The pre-processing combination of SNV and 1st-Der resulted in high-performance prediction models for total chlorophyll, carotenoid, and glucosinolate contents, and the combination of S. G. filter and SNV gave the highest prediction rate for total phenolics. Including SNV in the pre-processing conditions was essential for developing accurate prediction models for these functional components. By enabling visualization of the distribution of functional components on the hyperspectral images, PLSR prediction models will prove valuable in determining the harvest time.


Introduction
Brassica juncea is a member of the family Brassicaceae whose leaves contain a variety of functional components, including chlorophyll, carotenoid, flavonoid, and phenolic components, as well as glucosinolate and anthocyanin components [1][2][3][4][5]. The content of functional components in crop plants can vary widely depending on the cultivation conditions. The bioactivities of such functional components and their applications as pharmacological agents, health functional foods [6], and cosmetic materials [7] have been extensively studied. For example, chlorophylls and anthocyanins confer the characteristic green and red colors to plant organs, respectively, and they may also indicate crop management or growth conditions in wilted or diseased plant organs [8,9]. In addition, carotenoids have been investigated for their anti-obesity activity [10], and flavonoids, phenolics, and anthocyanins have been investigated for anti-obesity [11], antioxidant, and antimicrobial activities [12,13]. The glucosinolate derivative isothiocyanate has been investigated for anticancer [14,15] and antimicrobial activities [16].
Recently, there has been growing interest in the detection and quantification of specific bioactive substances for targeted applications. Concomitantly, studies on faster and more efficient detection methods have attracted attention. The most widely used detection method involves the collection of plant samples coupled with invasive extraction using organic solvents, followed by analysis of the extracts. Such detection methods involving organic solvents and chemical reagents are time consuming, inefficient, and harmful to the environment. To overcome these disadvantages of chemical methods, many scientists have conducted chemometrics studies based on spectroscopic knowledge to analyze the functional components in plants, using techniques such as high-performance liquid chromatography (HPLC) and near-infrared (NIR) spectroscopy [17], or Fourier transform NIR (FT-NIR) spectroscopy [18,19]. However, these methods are invasive and inefficient, and data processing is time consuming. Alternatively, recent studies have focused on noninvasive methods. For example, hyperspectral imaging systems have been used to explore various parameters, including plant-derived functional components, such as total chlorophylls and total carotenoids [20], total capsaicinoids [17], total glucosinolates [21], total flavonoids [22][23][24], and total polysaccharides [24], as well as microbial contamination of fish flesh [25], fruit moisture content [26], and sulfur dioxide residues on fruit surfaces [27]. Among the studies aimed at developing prediction models for functional components, prediction of total anthocyanins [28] at $R_C^2$ = 0.883 and $R_V^2$ = 0.830 and of total polyphenols [29] at $R_C^2$ = 0.820 and $R_V^2$ = 0.551 has been reported. However, in these studies, only one or two functional components were targeted for prediction model construction using hyperspectral images.
In this study, it was hypothesized that multiple components could be predicted noninvasively and simultaneously from hyperspectral images of the leaves of B. juncea. To develop a prediction model for multiple components, a partial least squares regression (PLSR) method was used with 10 pre-processing combinations. Compared to other regression models, the PLSR method enables the development of a predictive model with high accuracy using only a relatively small amount of data. For this reason, it is widely used in industrial applications: it reduces data acquisition time, and the resulting model is simple to apply. Total chlorophylls, total carotenoids, total phenolics, total glucosinolates, and total anthocyanins were selected as target functional components for prediction, because they are the most common functional components in plants that benefit human health and can be estimated by a spectrophotometric method after simple solvent extraction of plant samples. B. juncea was cultivated under various conditions to obtain functional components with a wide range of concentrations. The harvested leaves were used for hyperspectral imaging and functional component quantification. In addition, visualization software was developed to display the distribution of target components in real time. The results of this study suggest the possibility of noninvasive prediction of multiple components from one hyperspectral image. In addition, the models and visualization software could be applied to agricultural systems (such as growth chambers, indoor farms, and greenhouses) to monitor functional components in real time.

Plant Growth Conditions
To obtain Brassica juncea (L.) Czern. leaf samples with varying concentrations of functional components, three cultivation environments were implemented: an indoor farm, a greenhouse, and an open field (Figure 1). B. juncea was cultivated in the indoor farm in a hydroponic system under a combination of red, blue, and green LED light at 18-23 °C; Hoagland nutrient solution with an electrical conductivity (EC) of 1.5 dS/m was used as the growth medium. Greenhouse and open-field cultivation were carried out in fertilized soil at 20-28 °C and at 15-20 °C under sunlight, respectively. The leaves of B. juncea were collected after 6 weeks of cultivation in each environment. Considering the growth phase and the varied distributions of leaf colors, 15-20 full-grown leaves were harvested from each cultivation environment. A total of 55 leaves were sampled and stored at −20 °C until further analysis. Spectral data were obtained for all the samples by hyperspectral imaging. After four days of freeze-drying and subsequent grinding, each sample's powder was divided into 20 mg portions in triplicate for analysis of the five functional components. Content values with a large deviation for each functional component were excluded, and the mean of the remaining values was used as the component content value.

Total Chlorophyll and Carotenoid Contents
Measurements were made using extracts prepared in triplicate by adding 2 mL of 90% MeOH containing 10% water (v/v) to 20 mg of each sample, followed by sonication for 1 h at 40 °C. The resulting crude extract was centrifuged at 4000 rpm and 4 °C for 20 min to separate plant debris from the supernatant. The supernatant was filtered using a 0.45 μm membrane filter, and 1.5 mL of this filtrate was collected for quantification of functional components. For analysis, 150 μL of the filtered supernatant was mixed with 90% MeOH containing 10% water to prepare 1.5 mL of a 10× diluted solution. Following the method described previously [30], the absorbance of the diluted sample solution was measured at 665.2, 652.4, and 470.0 nm using a spectrophotometer (Cary 60 UV-Vis, Agilent Technologies, Santa Clara, CA, USA). The spectrophotometer uses a xenon flash lamp (80 Hz) as its light source and measures wavelengths from 190 nm to 1100 nm with a resolution of 1.5 nm at a scanning speed of 24,000 nm per min. Absorbance (A) at each wavelength was used in Equations (1)-(3) to calculate total chlorophyll a, total chlorophyll b, and total carotenoids, respectively.

where $A_{665.2}$ is the absorbance at 665.2 nm; $A_{652.4}$ is the absorbance at 652.4 nm; $A_{470.0}$ is the absorbance at 470.0 nm; Chla stands for total chlorophyll a; Chlb stands for total chlorophyll b; Chla + Chlb stands for total chlorophyll content; and Car stands for total carotenoid content. The data are expressed as mean ± standard deviation (mg g⁻¹ dry weight, DW) from biological triplicates.

Total Phenolic Contents
A previously described method [31] was used after modification for the detection of total phenolic content. For analysis, 100 µL of the 1.5 mL of supernatant (Section 2.2) was mixed with 100 µL of Folin-Ciocalteu reagent and 1.5 mL of distilled water, and the mixture was incubated for 5 min at room temperature. Then, 300 µL of 7.5% Na₂CO₃ solution was added, and the mixture was allowed to react for 1 h at room temperature. Absorbance was measured at 765.0 nm using a spectrophotometer. A standard curve prepared using 25, 50, 100, and 250 ppm gallic acid standard solutions was used to calculate total phenolic content in the samples from their measured absorbance. The data are expressed as mean ± standard deviation (mg g⁻¹ DW) from biological triplicates.

Total Glucosinolate Contents
A 1.5 mL supernatant of the crude extract was obtained by the method described in Section 2.2. For analysis, 50 µL of this supernatant was mixed with 150 µL of distilled water and 1.5 mL of 2 mM sodium tetrachloropalladate (II), and the mixture was left to react for 1 h at room temperature. The absorbance was measured at 425.0 nm using a spectrophotometer. The absorbance value was then used to calculate total glucosinolate content according to Equation (4) [32], where $A_{425.0}$ is the absorbance at 425.0 nm. The data are expressed as mean ± standard deviation (µmol g⁻¹ DW) from biological triplicates.

Total Anthocyanin Contents
The method described in a previous study [33] was modified and used for the estimation of total anthocyanins. Sample extracts were prepared in triplicate. Briefly, 2 mL of acidic MeOH containing 1% HCl (v/v) was added to 20 mg of powdered sample (Section 2.1) and sonicated for 1 h at 60 °C. The resulting crude extract was centrifuged and filtered in the same way as in the preparation of samples for total chlorophylls (Section 2.2). Then, 300 µL of the filtered supernatant was diluted with MeOH containing 1% HCl to prepare 1.5 mL of a 5× diluted solution. The absorbance of this solution was measured at 530.0 and 600.0 nm, and the values obtained were used to calculate total anthocyanin content according to Equation (5), as reported previously [33].
where $A_{530.0}$ is the absorbance at 530.0 nm; $A_{600.0}$ is the absorbance at 600.0 nm; V is the total volume of the extracted solution; n is the dilution ratio; Mw is the molecular weight of cyanidin-3-glucoside (i.e., 449.4); ε is the molar extinction coefficient of anthocyanin (29,600 M⁻¹ cm⁻¹); and m is the mass of the sample. The data are expressed as mean ± standard deviation (mg g⁻¹ DW) from biological triplicates.
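The symbols defined above are those of the commonly used differential-absorbance calculation for anthocyanins. The sketch below assumes that form, (A530 − A600) × V × n × Mw / (ε × m), with V in mL, m in g, and a 1 cm path length; this is an assumption, because Equation (5) itself is not reproduced here. The function name and the example absorbances are hypothetical and are not measured data.

```python
def total_anthocyanin_mg_per_g(a530, a600, volume_ml, dilution, sample_mass_g,
                               mw=449.4, epsilon=29600.0):
    """Estimate total anthocyanins (mg per g DW) as cyanidin-3-glucoside equivalents.

    Assumed form: (A530 - A600) * V * n * Mw / (epsilon * m), with a 1 cm cuvette.
    A / epsilon gives mol/L; multiplying by Mw gives g/L, which equals mg/mL.
    """
    if sample_mass_g <= 0:
        raise ValueError("sample_mass_g must be positive")
    return (a530 - a600) * volume_ml * dilution * mw / (epsilon * sample_mass_g)


# Illustrative values only: 2 mL extract, 5x dilution, 20 mg of powder.
content = total_anthocyanin_mg_per_g(0.62, 0.12, volume_ml=2.0,
                                     dilution=5, sample_mass_g=0.020)
print(f"{content:.2f} mg/g DW")
```

Subtracting the 600 nm reading is a simple way to correct for background absorbance that is not due to anthocyanins.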

Hyperspectral Imaging
A hyperspectral imaging camera (MicroHSI 410 SHARK, Corning Inc., Corning, NY, USA) was used. The detailed specifications of the hyperspectral camera used in this study are summarized in Table 1. As shown in Figure 2, the hyperspectral imaging system was equipped with eight halogen lamps (15 W × 8) as light sources; for data scanning, the hyperspectral camera was moved by a conveyor at the top of a dark chamber that blocked all external light. Considering the time and speed of a single scan, two or three samples were measured per hyperspectral image at a moving speed of 100 mm s⁻¹.

Data Processing and Prediction Models
The spectral data required for model development were extracted from the hyperspectral image data of B. juncea leaves using the Spectral library in a Python 3.8 environment. After obtaining RGB images from three bands (Blue: 456.55 nm, Green: 544.61 nm, and Red: 660.67 nm), the area of the sample for functional component analysis was chosen by polygon selection and set as the region of interest (ROI) (Figure 3). An area of 1 cm width along the leaf edge was excluded from the ROI and from the analysis, as it was presumed to contain considerable background noise. The leaf vein area was also excluded because of its relatively low content of functional components. The average spectra of the ROIs across all samples were used for PLSR model development after checking their correlation with each functional component based on absorbance wavelengths.
To predict each functional component, the extracted average spectra were imported into the Unscrambler X software v.11 (CAMO, Oslo, Norway) for model development through PLSR with full cross validation. The maximum number of PLS components was limited to 20. Various data pre-processing protocols were combined with PLSR and tested to maximize the prediction performance of the developed model by reducing noise in the spectral data, including the Savitzky-Golay filter (S. G. filter) with 3 or 7 smoothing points, standard normal variate (SNV), multiplicative scatter correction (MSC), mean normalization, 1st-order derivative (1st-Der), and 2nd-order derivative (2nd-Der). The applied pre-processing combinations are listed in Table 2.

Table 2. Pre-processing combinations tested in this study.

Methods | Pre-Processing Conditions
1 | Raw data
2 | Raw data, S. G. filter (interval = 3)
3 | Raw data, S. G. filter (interval = 7)
4 | Raw data, S. G. filter (interval = 3), SNV
5 | Raw data, S. G. filter (interval = 3), MSC
6 | Raw data, 1st-Der
7 | Raw data, 2nd-Der
8 | Raw data, SNV, 1st-Der
9 | Raw data, SNV, 2nd-Der
10 | Raw data, Normalization

The performance of the PLSR-based prediction models developed in this study was evaluated based on the $R^2$ and root mean square errors (RMSE) of calibration and validation, respectively, using Equations (6) and (7). From the prediction models developed with various pre-processing combinations, the model exhibiting comparatively high $R_V^2$ and low RMSE for validation was selected.
where $y_i$ is the measured value of the component obtained by analysis; $\hat{y}_i$ is the value predicted by the model; $\bar{y}$ is the mean of the measured values; and $n$ is the number of samples, which was 55 in this experiment.
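The two metrics can be sketched with the symbols defined above. The code assumes the standard definitions, R² = 1 − Σ(yᵢ − ŷᵢ)² / Σ(yᵢ − ȳ)² and RMSE = √(Σ(yᵢ − ŷᵢ)² / n); the exact form of Equations (6) and (7) may differ (for example, R² is sometimes reported as a squared correlation coefficient). The function names and the example numbers are illustrative only.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error over n samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up measured and predicted contents, for illustration only.
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(r_squared(measured, predicted), rmse(measured, predicted))
```

Applied to calibration predictions these functions give the calibration R² and RMSEC; applied to cross-validated predictions they give the validation R² and RMSEP.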

Development of Visualization Software for Applying Predictive Models
Visualization software was developed so that the output of the developed models could be interpreted intuitively. The PyQT5, OpenCV, Pillow, Spectral, Matplotlib, Numpy, Loguru, and Pandas libraries were used in a Python 3.8 environment, and the GUI was constructed using PyQT5. The predicted value of each component was calculated by multiplying the spectrum obtained at each pixel of the hyperspectral image data by the model weights at each wavelength. The predicted values for each pixel were visualized with a jet colormap, in which the color bar indicates the value of the functional component predicted by the PLSR models.

Table 3 summarizes the results of the quantification of the five main functional components found in the leaves of B. juncea (i.e., total chlorophyll, total carotenoid, total phenolic, total glucosinolate, and total anthocyanin content). With respect to total chlorophyll and total anthocyanin content, which are responsible for leaf color, the former was relatively low when the anthocyanin content was high. Total chlorophyll content ranged from 2.13 to 11.70 mg g⁻¹ DW, with a mean value of 6.33 ± 2.21 mg g⁻¹ DW. The amount of total carotenoids in B. juncea is generally low; it had the lowest mean value (0.91 mg g⁻¹ DW) among the five functional components studied here. The total phenolic contents ranged from 2.11 to 9.56 mg g⁻¹ DW, with a mean of 4.85 ± 1.94 mg g⁻¹ DW. A common feature of the family Brassicaceae is the abundance of glucosinolate components; accordingly, B. juncea leaves exhibited the highest minimum, maximum, and mean values for total glucosinolate content among the five functional components analyzed. Anthocyanins are responsible for the red and blue color of B. juncea leaves, and the color intensity varies depending on the cultivation conditions. A high level of anthocyanins reportedly requires blue light, a cool climate, and a large daily temperature range [34][35][36]. Consistent with this, B. juncea cultivated in the cool open-field environment exhibited relatively higher anthocyanin production, whereas B. juncea cultivated in the glass greenhouse at higher temperatures produced light-green to green leaves according to hyperspectral imaging. Thus, total anthocyanin content varied greatly from 0 to 33.80 mg g⁻¹ DW, with a mean of 5.41 ± 6.75 mg g⁻¹ DW.

Average Spectra and Correlation Analysis
The average spectra are shown in Figure 4A, and the correlations between the content of each of the five functional components and the spectral data at each wavelength are shown in Figure 4B-F. Except for total phenolic content, the functional components exhibited their strongest negative correlation coefficients in the 400-600 nm range of the visible region. Total phenolic content showed a positive correlation in the wavelength bands corresponding to the blue and green regions. This is consistent with higher total phenolic content in B. juncea leaves with relatively larger green areas, and with higher total chlorophylls, total carotenoids, total glucosinolates, and total anthocyanins in leaves with relatively smaller green areas. The potential of PLSR-based prediction models for quantifying the functional components was supported by correlation coefficients with magnitudes of approximately 0.5 for all components, distributed evenly across the wavelength bands.
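A band-by-band correlation of this kind can be sketched as follows, assuming a Pearson correlation between the measured content and the average spectral value at each wavelength; the type of correlation coefficient is not stated here, so that choice is an assumption. The function name `bandwise_correlation` and the toy data are hypothetical.

```python
import numpy as np


def bandwise_correlation(spectra, contents):
    """Pearson r between component content and the spectral value at each band.

    spectra: array of shape (n_samples, n_bands); contents: shape (n_samples,).
    A band with zero variance yields nan (division by zero).
    """
    spectra = np.asarray(spectra, dtype=float)
    contents = np.asarray(contents, dtype=float)
    xc = spectra - spectra.mean(axis=0)          # centre each band
    yc = contents - contents.mean()              # centre the contents
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return (xc * yc[:, None]).sum(axis=0) / denom


# Toy data: band 0 rises with content, band 1 falls, band 2 is random noise.
rng = np.random.default_rng(1)
contents_demo = np.linspace(1.0, 10.0, 12)
spectra_demo = np.column_stack([
    0.05 * contents_demo + 0.2,
    -0.03 * contents_demo + 0.9,
    rng.normal(size=12),
])
r_demo = bandwise_correlation(spectra_demo, contents_demo)
print(np.round(r_demo, 3))
```

Plotting the returned vector against wavelength gives a curve of the kind described for Figure 4B-F.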

Development of PLSR Models Using Spectral Data Extracted from HSI
PLSR models for predicting five important functional components in plant leaf tissues were developed by applying various pre-processing combinations. The performance of each prediction model according to the pre-processing combination for each of the five functional components is shown in Table 4. Of the five functional components studied, the highest prediction accuracy was recorded for total anthocyanins with almost all pre-processing methods. Notably, among the ten pre-processing combinations tested, the 6th combination registered $R_V^2$ and RMSEP values of 0.8273 and 2.4277, respectively, indicating the highest level of prediction performance. This may be because anthocyanins, whose distribution is referred to as 'purple magic' by Kim et al. [37], accounted for the largest purple areas over the plant body. When dark-red areas on B. juncea leaves were visibly greater, the anthocyanin content was higher, which may have influenced the spectrum extracted from the hyperspectral images. In turn, the highest prediction performance for total phenolic content was registered for the 4th pre-processing combination, i.e., S. G. filter and SNV. Apart from these two components, the highest prediction performance for total chlorophyll, total carotenoid, and total glucosinolate content was observed for the 8th pre-processing combination, with SNV and 1st-Der. SNV, which is shared by the 4th and the 8th combinations, is a well-known normalization method based on the standard deviation of the overall spectrum that eliminates the influence of light scattering. Presumably, SNV corrected the spectral variability that may arise from vibration of the moving camera module during hyperspectral image acquisition.
In addition, pre-processing by 1st-Der, which is shared by the 6th and the 8th combinations, is effective in correcting the baseline variations originating from differences in the relative intensity of the light sources. This is because differentiating the spectrum removes constant offsets and emphasizes the changes within the absorption bands, so that only the spectral variation remains. This is presumed to account for the better performance of the prediction model with the 8th pre-processing combination for total chlorophylls, total carotenoids, and total glucosinolates.

Table 4. PLSR prediction model outcomes on functional components for each pre-processing combination. The prediction model with the best performance among the pre-processing methods was determined as the one with the lowest RMSEP value (marked in bold).

Notably, the prediction models for total chlorophyll and total carotenoid content performed poorly ($R^2$ < 0.3) with the 7th and 9th pre-processing combinations, both of which include 2nd-Der. Like 1st-Der, 2nd-Der captures changes in spectral slope and thereby corrects the baseline and removes minor noise in the system. One possible interpretation is that this noise removal also removed the effects of absorption bands produced by minor components, thereby lowering prediction performance. In previous studies, total phenolics were predicted at $R_C^2$ = 0.820 and $R_V^2$ = 0.551 by Caporaso et al. [29], and total anthocyanins were predicted at $R_C^2$ = 0.883 and $R_V^2$ = 0.830 by Liu et al. [28]. By comparison, the models reported herein achieved a higher $R_V^2$ for total phenolics ($R_C^2$ = 0.8204 and $R_V^2$ = 0.6909), and a higher $R_C^2$ with a comparable $R_V^2$ for total anthocyanins ($R_C^2$ = 0.9144 and $R_V^2$ = 0.8273).
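The effects attributed to SNV and 1st-Der above can be checked on a toy spectrum. In this sketch SNV is applied row-wise using the sample standard deviation, and the derivative is a plain finite difference via `numpy.gradient`; the study does not state how its derivative was computed, so that choice is an assumption. The wavelength grid and reflectance curve are made up.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum by its own mean and SD."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def first_derivative(spectra, wavelengths):
    """Finite-difference first derivative of each spectrum along the wavelength axis."""
    return np.gradient(np.asarray(spectra, dtype=float), wavelengths, axis=1)


wavelengths = np.linspace(400.0, 1000.0, 150)    # hypothetical band centres (nm)
base = 0.4 + 0.2 * np.sin(wavelengths / 80.0)    # made-up reflectance curve
scattered = 1.8 * base + 0.15                    # same curve with a gain and an offset

after_snv = snv(np.vstack([base, scattered]))
after_der = first_derivative(np.vstack([base, base + 0.15]), wavelengths)

print(np.allclose(after_snv[0], after_snv[1]))   # SNV removes the gain and the offset
print(np.allclose(after_der[0], after_der[1]))   # the derivative removes the offset
```

Under these definitions, the 8th combination corresponds to `first_derivative(snv(spectra), wavelengths)`.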
Furthermore, the 4th and 8th pre-processing combinations in the PLSR models for the five functional components resulted in the most outstanding prediction performance, implying that pre-processing the spectra with SNV is essential for developing high-performance prediction models for the five components. Figure 5 depicts the results of the prediction models with the highest prediction performance among all models developed. The selected models did not have a very high prediction rate for functional components; nevertheless, the results verify their potential for predicting the concentrations of multiple functional components under actual cultivation conditions of B. juncea. In addition, other modeling approaches, such as machine learning, mixed models, or principal component analysis, have the potential to yield better predictive performance. The combinations of pre-processing methods used in this study should serve as a useful reference when applying such models.

Hyperspectral imaging is a technology that combines spectroscopy and imaging to provide optical information about a sample. Therefore, even if the amounts of components are the same, the spectral properties will differ if the distribution of these components in the plant differs. Thus, direct application of the models in this study to other plants is difficult, and new training data from those plants are required to build prediction models for them. However, the pre-processing methods and the methodology presented in this paper can be applied to new plants to develop prediction models for multiple components quickly. Such trials can expand the scientific significance and industrial uses of engineering based on hyperspectral imaging systems.

Application of the Functional Component Prediction Model for Visualization
For the prediction models developed for the five plant functional components under study, the predictions can be visualized as a distribution map, in which the model predicts a component value from the spectrum at each pixel of the hyperspectral image. Figure 6 shows such a visualization based on the PLSR prediction models, in which the color of each pixel represents the predicted concentration of a functional component according to the color map. Hence, the variation from high to low concentrations of functional components across the leaf can be seen intuitively. Similarly, training data collected during open-air cultivation under actual sunlight are likely to improve the accuracy of the prediction models under those conditions, which in turn is likely to enable monitoring of the functional components during cultivation. This should enable farmers to determine the best time for harvest based on the prediction of key functional components.
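A minimal sketch of how such a per-pixel map can be computed is given below. It assumes the fitted PLSR model has been reduced to one regression coefficient per band plus an intercept, and that the image is a (rows, columns, bands) array; the function name, the optional leaf mask, and the toy cube are hypothetical and are not taken from the visualization software described in this paper.

```python
import numpy as np


def predict_map(cube, coefficients, intercept=0.0, mask=None):
    """Apply a linear model to every pixel spectrum of a hyperspectral cube.

    cube: (rows, cols, bands); coefficients: (bands,).
    mask: optional boolean (rows, cols) array; pixels outside it become nan.
    """
    cube = np.asarray(cube, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    predicted = cube @ coefficients + intercept   # weighted sum over the band axis
    if mask is not None:
        predicted = np.where(mask, predicted, np.nan)
    return predicted


# Toy 2 x 3 image with 4 bands; all numbers are illustrative only.
cube_demo = np.ones((2, 3, 4))
coef_demo = np.array([1.0, 2.0, 3.0, 4.0])
leaf_mask = np.array([[True, True, False],
                      [True, False, False]])
map_demo = predict_map(cube_demo, coef_demo, intercept=0.5, mask=leaf_mask)
print(map_demo)
```

The resulting array can be displayed with `matplotlib.pyplot.imshow(map_demo, cmap="jet")` followed by `matplotlib.pyplot.colorbar()`. Any pre-processing applied to the training spectra would need to be applied to each pixel spectrum before this step.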

Conclusions
Using hyperspectral imaging, PLSR models were developed for predicting the contents of five key functional components in the leaves of B. juncea: chlorophylls, carotenoids, phenolics, glucosinolates, and anthocyanins. To develop the models, the region of interest for analysis of the functional components was uniformly chosen by polygon selection from the hyperspectral image data, and the average spectra were extracted. Various pre-processing combinations were applied to the spectral data to identify the model with the best prediction performance. The resulting PLSR prediction models had $R_C^2$ ≥ 0.8 for total phenolic, total glucosinolate, and total anthocyanin contents, and valid models were thus obtained for each functional component. In addition, the models for total chlorophylls and total carotenoids had $R_C^2$ of 0.6842 and 0.6775, respectively, which implies the potential for developing more efficient models via further data processing in the future. Overall, among the ten pre-processing conditions tested here, the 8th pre-processing condition combining SNV and 1st-Der resulted in the highest prediction performance for functional components. Additionally, the 4th combination containing SNV exhibited outstanding prediction performance for total phenolic content. Hence, the most efficient pre-processing condition for the development of a high-performance prediction model was SNV, shared by the 4th and 8th pre-processing combinations. In addition, multiple chemical components could be analyzed effectively from a single hyperspectral image. Considering the application of the models developed here, it is possible to draw a distribution map for functional components by applying the spectral data in each pixel of the hyperspectral images to the prediction models. The variation from high to low concentrations of functional components in the leaf can thus be seen intuitively, as the content value of each functional component is associated with the color map. Based on our findings, applying the models to hyperspectral images acquired under actual cultivation in natural sunlight conditions is likely to allow real-time monitoring of the components of interest. Furthermore, the prediction of the functional components will contribute to accurate determination of the best time for harvest.