Application of Multispectral Camera in Monitoring the Quality Parameters of Fresh Tea Leaves

Abstract: The production of high-quality tea from Camellia sinensis (L.) O. Ktze is a goal shared by producers and consumers. Rapid, nondestructive, and low-cost methods for monitoring tea quality could improve both tea quality and the associated economic benefits. This research explored the possibility of monitoring tea leaf quality from multispectral images. Threshold segmentation and manual sampling methods were used to eliminate the image background, after which the spectral features were constructed. In addition, the texture features of the multispectral images of the tea canopy were extracted. Three machine learning methods, partial least squares regression, support vector machine regression, and random forest regression (RFR), were used to construct and train multiple monitoring models, which were then used to estimate four key quality parameters: tea polyphenols, total sugars, free amino acids, and caffeine content. Finally, the effects of automatic and manual image background removal methods, different regression methods, and texture features on model accuracy were compared. The results showed that the spectral characteristics of the canopy of fresh tea leaves were significantly correlated with the tea quality parameters (r ≥ 0.462). Among the sampling methods, the EXG_Ostu sampling method was best for prediction, whereas, among the models, RFR was the best-fitting modeling algorithm for three of the four quality parameters. The R² and root-mean-square error values of the built model were 0.85 and 0.16, respectively. In addition, the texture features extracted from the canopy image improved the prediction accuracy of most models. This research confirms that combining multispectral images with chemometrics provides a low-cost, fast, reliable, and nondestructive quality control method that can effectively monitor the quality of fresh tea leaves. This provides a scientific reference for the future research and development of generally applicable portable tea quality monitoring equipment.


Introduction
Tea (Camellia sinensis (L.) O. Ktze) is a popular beverage all over the world [1,2]. It is also an important cash crop in Qingyuan City, Guangdong Province, China, where it dominates local agriculture as a characteristic industry [3,4]. Tea polyphenols, caffeine, free amino acids, total sugars, and other tea components have anti-oxidative, anti-cancer, and anti-obesity properties, lower blood pressure, and prevent cardiovascular diseases [5–10]. In addition, the content of these components determines the taste, aroma, and appearance of tea [11], which, in turn, determine its quality and value [12]. Therefore, estimating and monitoring tea polyphenols and other quality parameters is important for improving tea quality and the associated economic benefits.

The experiment was conducted three times, in May, July, and September 2020. The average temperatures during the three experiments were 31, 35, and 30 °C, respectively. Each experiment was conducted at noon on a clear, cloudless day. In each tea garden, more than 10 sampling points were randomly selected for the collection of spectral data and fresh tea leaf samples. Each sampling point was more than 10 m from the edge of the road, and the sampling points were more than 100 m apart. The spectral images were taken vertically downward from 1 m above the canopy. The canopy spectral image, including the calibration plate, was taken three times at each sampling point. At the same time, more than 250 g of one-bud-and-two-leaf samples were collected at each sampling point for laboratory testing. The collected samples and spectral data covered more than a dozen tea varieties, mainly Yinghong No. 9, Huangdan, and Hongyan No. 12.
For the acquired spectral images, multi-band image registration, synthesis, reflectivity calculation, raster sampling, vegetation index calculation, and texture feature extraction were performed in sequence. Correlation analysis and model training were then performed on the obtained texture and spectral features together with the laboratory test data, and, finally, the accuracy of the models was evaluated. The data processing was implemented in MATLAB 2016b, and the results were displayed using RStudio. The experimental steps are shown in Figure 2.

Spectral Data
The ground multispectral data used in this study were collected by a multispectral camera (RedEdge-MX, Micasense, Seattle, WA, USA), which has been widely used in the field of agricultural remote sensing [29]. The spectral parameters of the multispectral sensor are shown in Table 1. The data acquisition system is shown in Figure 3.

Table 1. Spectral parameters of multispectral sensor.

| Band Number | Band Name | Center Wavelength (nm) | Bandwidth FWHM (nm) |
|---|---|---|---|
| 1 | Blue | 475 | 20 |
| 2 | Green | 560 | 20 |
| 3 | Red | 668 | 10 |
| 4 | Near-IR | 840 | 40 |
| 5 | Red Edge | 717 | 10 |

FWHM = full-width-at-half-maximum

Tea canopy images were acquired through the multispectral camera by shooting at a height of 80 cm from the tea tree canopies, and downward, perpendicular to the ground. A standard white board was placed at the center of the camera's field of view and an attempt was made to ensure that the field of view completely included tea plants.

Quality Parameters
More than 250 g of fresh tea samples, with one bud and two leaves, were collected from the sampling points. The collected tea samples were dried and submitted to a third-party testing agency (Xi'an Guolian Quality Testing Technology Co. Ltd., Xi'an, China). Subsequently, the contents of tea polyphenols, caffeine, total sugars, and free amino acids were determined and expressed as a percentage of dry weight.

Image Processing
(1) Registration program and band fusion

Tea canopy images were radiometrically calibrated using the digital numbers of the standard whiteboard, and the digital number maps were converted to standard reflectance images to improve data quality. The sensors of the individual channels of the multispectral camera are arranged in an array, so the images acquired by the channels are spatially offset from one another; however, the camera was not equipped with an automatic registration program. To facilitate band fusion and spectral information sampling, the SIFT algorithm [30] was used to automatically select and match features in each band image, after which the bands were fused. An example of the true-color RGB three-channel composite image before and after registration is shown in Figure 4.

(2) Raster sampling

To avoid the influence of soil and shadows in the captured images, the EXG index [31] was calculated on the combined image to enhance the tea features and distinguish tea from the background. Subsequently, the Otsu method [32] was used for image segmentation. After masking the background, pure tea areas in the image were obtained. The images corresponding to these processing steps are shown in Figure 5. The average value over the masked image area was then calculated; these data are referred to as EXG_Ostu (EO) sampling data in the following sections. Additionally, the average value over the whole image before background removal was calculated; these data are referred to as Global (G) sampling data. Finally, 10 leaf positions were manually selected on each image and their average value was calculated; these data are referred to as Manual (M) sampling data. The spectral data of the tea canopy leaves were extracted with all three sampling methods prior to further calculation and analysis.
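The EO sampling idea (enhance green vegetation with an excess-green index, threshold it with Otsu's method, then average the reflectance of the retained pixels) can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation. The formula ExG = 2g − r − b on chromatic coordinates is an assumption, since the text cites [31] without reproducing the equation, and the function names and the list-of-dicts pixel layout are hypothetical choices made to keep the sketch free of external dependencies.

```python
def excess_green(r, g, b):
    """Excess-green index on chromatic coordinates (assumed form: 2g - r - b)."""
    total = r + g + b
    if total == 0:
        return 0.0
    return (2 * g - r - b) / total


def otsu_threshold(values, bins=64):
    """Threshold that maximizes the between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (k + 0.5) * width for k in range(bins)]
    total = len(values)
    total_sum = sum(c * h for c, h in zip(centers, hist))
    best_k, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for k in range(bins - 1):
        w0 += hist[k]
        sum0 += centers[k] * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_k = between, k
    return lo + (best_k + 1) * width  # upper edge of the last background bin


def eo_sample(pixels):
    """Mean reflectance per band over pixels classified as vegetation.

    `pixels` is a list of dicts mapping band names to reflectance values.
    """
    exg = [excess_green(p["red"], p["green"], p["blue"]) for p in pixels]
    threshold = otsu_threshold(exg)
    tea = [p for p, e in zip(pixels, exg) if e > threshold]
    if not tea:  # uniform image: fall back to the global mean
        tea = pixels
    return {band: sum(p[band] for p in tea) / len(tea) for band in pixels[0]}
```

Averaging all pixels without the mask corresponds to the G sampling described above; the masked mean corresponds to EO sampling.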

Spectral Feature Construction
Previous studies have confirmed that vegetation indices can effectively strengthen the relationship between plant spectral information and physical and chemical parameters. Based on the actual situation, 24 commonly used vegetation indices were selected for calculation. The names and calculation formulas of the indices are shown in Table 2. In addition, the 5 original single bands, 3 color components in the hue-saturation-value color space, and 4 discrete first-order derivatives were included, giving a total of 36 parameters used as spectral features. To distinguish the vegetation index G from the Green channel, this article refers to the Green channel as g.
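As a rough illustration of how such a feature vector might be assembled from the mean band reflectances of one sample, consider the sketch below. Table 2 is not reproduced in this text, so the two indices shown (NDVI and NDRE) are illustrative assumptions rather than the paper's selection. Treating the "discrete first-order derivatives" as finite differences between spectrally adjacent bands is also an assumption; it does yield four values from five bands, consistent with the count above. The function name and dictionary keys are hypothetical.

```python
import colorsys

# Centre wavelengths (nm) taken from Table 1.
WAVELENGTHS = {"blue": 475, "green": 560, "red": 668, "red_edge": 717, "nir": 840}


def spectral_features(refl):
    """Build a feature dict from the mean band reflectances of one sample."""
    feats = dict(refl)  # the 5 original bands

    # Two illustrative vegetation indices (assumed, not taken from Table 2).
    feats["ndvi"] = (refl["nir"] - refl["red"]) / (refl["nir"] + refl["red"])
    feats["ndre"] = (refl["nir"] - refl["red_edge"]) / (refl["nir"] + refl["red_edge"])

    # HSV components of the true-colour (R, G, B) reflectance triple.
    h, s, v = colorsys.rgb_to_hsv(refl["red"], refl["green"], refl["blue"])
    feats.update({"hue": h, "saturation": s, "value": v})

    # Finite-difference slopes between spectrally adjacent bands.
    ordered = sorted(WAVELENGTHS, key=WAVELENGTHS.get)
    for left, right in zip(ordered, ordered[1:]):
        slope = (refl[right] - refl[left]) / (WAVELENGTHS[right] - WAVELENGTHS[left])
        feats[f"d_{left}_{right}"] = slope
    return feats
```

With the full set of 24 indices in place of the two shown here, the same structure would produce the 36 features described above.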

Texture Feature Extraction
In this study, gray level co-occurrence matrix (GLCM) [55–58] and local binary pattern (LBP) [59–61] methods were used to extract image texture features. The GLCM is a classic texture feature extraction method, which has mostly been used for auxiliary classification in previous research. LBP has gained attention more recently as a texture extraction method with a simple working principle and excellent performance, and is mostly used for face recognition in artificial intelligence; its basic coding principle is shown in Figure 6. The local differences in the tea canopy image are very subtle. The EO and M sampling methods can destroy the image texture features, and the standard whiteboard also affects the texture features of the canopy image. Therefore, in this study, the upper-right 1/9 of the image was cropped to extract the LBP texture features; additionally, the GLCM extraction was set to 16 gray levels, a default direction, and a step size of 1. The principles of the two texture extraction methods are as follows.

Figure 6. Basic LBP coding method.

LBP: The calculation formula is:

$$\mathrm{LBP}(x_c, y_c) = \sum_{p=0}^{P-1} 2^{p}\, S(i_p - i_c), \qquad S(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

where $(x_c, y_c)$ are the coordinates of the central pixel, $p$ indexes the $P$ pixels in the neighborhood, $i_c$ is the gray value of the central pixel, $i_p$ is the gray value of the $p$-th neighboring pixel, and $S(x)$ is the sign (step) function.

GLCM:

$$P(i, j \mid D, \theta) = \#\left\{ \left[(x, y), (x + D_x, y + D_y)\right] \;\middle|\; f(x, y) = i,\ f(x + D_x, y + D_y) = j \right\}$$

where $D$ is the relative distance expressed in pixels; $\theta$ is the texture calculation direction parameter, which is generally 0°, 45°, 90°, or 135°; $(D_x, D_y)$ is the pixel offset of length $D$ in direction $\theta$; $f(x, y)$ is the gray value at pixel $(x, y)$; $i, j = 0, 1, 2, \ldots, L-1$; $(x, y)$ are the pixel coordinates in the image; and $L$ is the number of gray levels.

The statistics of grayscale images after GLCM and LBP re-encoding are generally used to describe features. In this study, energy (Asm), entropy (Ent), contrast (Con), correlation (Cor), and their respective variances were selected as the descriptors of the GLCM texture features. The average gray level (µ), mean square error (σ), skewness (S), kurtosis (K), energy (G), information entropy (E), and smoothness (R) of the histogram of the LBP-encoded image were selected as the descriptors of the LBP texture features. These descriptors were calculated as follows (given here in their standard forms).

GLCM, with $P(i, j)$ normalized to sum to 1:

$$\mathrm{Asm} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} P(i,j)^2, \qquad \mathrm{Ent} = -\sum_{i=0}^{L-1}\sum_{j=0}^{L-1} P(i,j)\,\log P(i,j)$$

$$\mathrm{Con} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} (i-j)^2\, P(i,j), \qquad \mathrm{Cor} = \frac{\sum_{i=0}^{L-1}\sum_{j=0}^{L-1} i\,j\,P(i,j) - \mu_x \mu_y}{\sigma_x \sigma_y}$$

among which

$$\mu_x = \sum_{i} i \sum_{j} P(i,j), \quad \mu_y = \sum_{j} j \sum_{i} P(i,j), \quad \sigma_x^2 = \sum_{i} (i-\mu_x)^2 \sum_{j} P(i,j), \quad \sigma_y^2 = \sum_{j} (j-\mu_y)^2 \sum_{i} P(i,j)$$

LBP:

$$\mu = \sum_{g=0}^{L-1} g\,P(g), \qquad \sigma = \sqrt{\sum_{g=0}^{L-1} (g-\mu)^2 P(g)}, \qquad S = \frac{1}{\sigma^3}\sum_{g=0}^{L-1} (g-\mu)^3 P(g), \qquad K = \frac{1}{\sigma^4}\sum_{g=0}^{L-1} (g-\mu)^4 P(g)$$

$$G = \sum_{g=0}^{L-1} P(g)^2, \qquad E = -\sum_{g=0}^{L-1} P(g)\,\log_2 P(g), \qquad R = 1 - \frac{1}{1+\sigma^2}$$

where $P(g)$ is the normalized gray-level histogram of the LBP-coded image, µ is the average gray level of the image, σ is the mean square error reflecting the average image contrast, S is the skewness reflecting the symmetry of the histogram distribution, K is the kurtosis reflecting the closeness of the image gray levels to the mean value, G is the energy reflecting the image uniformity, E is the information entropy reflecting the randomness of the image grayscale, and R is the smoothness reflecting the relative smoothness of the image.

Remote Sens. 2021, 13, 3719
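The two encodings can be illustrated with a small, dependency-free sketch. It assumes the basic 8-neighbor LBP and a single GLCM offset, takes the image as a list of rows of integer gray values already quantized to `levels` (the study used 16 levels and a step size of 1 for the GLCM), and computes only a subset of the descriptors in their textbook forms. The function names are hypothetical and the code is not the authors' implementation.

```python
import math


def lbp_image(img):
    """8-neighbour LBP code for every interior pixel of a 2-D grey image."""
    # Neighbour offsets (dy, dx), clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for y in range(1, len(img) - 1):
        row = []
        for x in range(1, len(img[0]) - 1):
            center = img[y][x]
            code = 0
            for p, (dy, dx) in enumerate(offsets):
                if img[y + dy][x + dx] >= center:  # S(i_p - i_c) = 1
                    code += 1 << p
            row.append(code)
        codes.append(row)
    return codes


def histogram_stats(codes, levels=256):
    """Mean, standard deviation, energy and entropy of the code histogram."""
    flat = [c for row in codes for c in row]
    prob = [0.0] * levels
    for c in flat:
        prob[c] += 1 / len(flat)
    mu = sum(g * p for g, p in enumerate(prob))
    sigma = math.sqrt(sum((g - mu) ** 2 * p for g, p in enumerate(prob)))
    energy = sum(p * p for p in prob)
    entropy = -sum(p * math.log2(p) for p in prob if p > 0)
    return {"mu": mu, "sigma": sigma, "energy": energy, "entropy": entropy}


def glcm(img, levels, dx=1, dy=0):
    """Normalized grey-level co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    n = 0
    for y in range(len(img)):
        for x in range(len(img[0])):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < len(img) and 0 <= x2 < len(img[0]):
                counts[img[y][x]][img[y2][x2]] += 1
                n += 1
    return [[c / n for c in row] for row in counts]


def glcm_stats(p):
    """Energy (Asm) and contrast (Con) of a normalized GLCM."""
    asm = sum(v * v for row in p for v in row)
    con = sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))
    return {"asm": asm, "con": con}
```

A perfectly uniform patch yields a single LBP code and zero GLCM contrast, whereas a checkerboard maximizes contrast at a one-pixel offset, which is the kind of difference these descriptors are meant to capture.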

Feature Selection
Correlation analysis is a conventional and effective dimensionality reduction method. We analyzed the correlation between the spectral and texture features and the tea quality parameters, and selected the 10 features with the highest absolute correlation coefficients as regression inputs to reduce the computational load.
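A correlation-based filter of this kind can be sketched in a few lines. The example assumes Pearson's r as the correlation coefficient (the text does not name the coefficient explicitly), and the function names and the dict-of-lists layout are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # a constant series carries no correlation
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def top_k_features(features, target, k=10):
    """Names of the k features with the largest |r| against the target.

    `features` maps a feature name to its list of per-sample values.
    """
    ranked = sorted(
        features,
        key=lambda name: abs(pearson_r(features[name], target)),
        reverse=True,
    )
    return ranked[:k]
```

Ranking by the absolute value keeps strongly negative correlations, which are as informative for regression as strongly positive ones.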

Regression Modeling
Three simple and effective regression modeling algorithms, partial least squares regression (PLS), support vector machine regression (SVR), and random forest regression (RFR), were used in this study. Among these, PLS is widely used to study the relationship between multiple dependent and independent variables. It combines the advantages of principal component analysis, canonical correlation analysis, and linear regression, and can effectively identify the dominant factors with the strongest explanatory power for the dependent variable. PLS is especially suited to problems involving multicollinearity between variables or a number of variables larger than the number of samples [62,63]. It can successfully model the linear relationship between spectral data and chemical composition, especially when the original spectral data are high-dimensional and multicollinear [64]. SVR can provide a more rational solution to the above-mentioned problems than the linear method can [65]. SVR uses a kernel function to map input variables to a high-dimensional feature space [66]; therefore, it can process high-dimensional input vectors. Recently, SVR has been widely used in spectral analysis, where it has produced accurate calibration results [67–70]. The RFR algorithm is an ensemble learning algorithm that combines a large number of regression trees, each of which represents a series of conditions or constraints that are organized in a hierarchical structure and applied sequentially from the root to the leaves of the tree. RFR starts with multiple bootstrap samples, which are randomly drawn from the original training dataset. Subsequently, a regression tree is fitted to each bootstrap sample. At each tree node, a small group of input variables randomly selected from the total set is considered for the binary partitioning [71–73].
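The two ingredients of RFR described above, bootstrap sampling and a random subset of input variables at each split, can be made concrete with a deliberately simplified sketch. Each "tree" here is a single binary split (a regression stump), so this is not a full random forest and not the implementation used in the study; in practice a library implementation would be used. All function names and default parameters are hypothetical.

```python
import random


def fit_stump(X, y, feature_ids):
    """Best single binary split over the given features (least squared error)."""
    best = None
    for f in feature_ids:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr, ml, mr)
    if best is None:  # every candidate feature is constant in this sample
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]


def fit_forest(X, y, n_trees=50, n_sub_features=2, seed=0):
    """Fit stumps on bootstrap samples, each seeing a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feats = rng.sample(range(d), min(n_sub_features, d))  # random subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Average the predictions of all stumps for one feature vector."""
    preds = [ml if row[f] <= thr else mr for f, thr, ml, mr in forest]
    return sum(preds) / len(preds)
```

Averaging over many trees fitted to different resamples is what reduces the variance of the individual trees; deeper trees would repeat the same split search recursively at every node.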

Accuracy Evaluation
The coefficient of determination (R²) and root-mean-square error (RMSE) were used to comprehensively evaluate the model accuracy. Random subsampling verification (the hold-out method) was used for validation, and the two parameters were calculated as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (x_i - y_i)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - y_i)^2}$$

where $x_i$ is the real measured value, $y_i$ is the predicted value, $\bar{x}$ is the average of the measured values, and $n$ is the number of samples.
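A minimal sketch of the two metrics and of a random hold-out split is given below. R² is computed as one minus the ratio of residual to total sum of squares, which is the form consistent with the symbols defined in the text. The 30% test fraction is an assumption, since the split ratio is not stated, and the function names are hypothetical.

```python
import math
import random


def r2_rmse(measured, predicted):
    """Return (R^2, RMSE) for paired measured and predicted values."""
    n = len(measured)
    mean = sum(measured) / n
    ss_res = sum((x - y) ** 2 for x, y in zip(measured, predicted))
    ss_tot = sum((x - mean) ** 2 for x in measured)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / n)


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Randomly partition sample indices into (train, test) lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n_samples * test_fraction))
    return idx[n_test:], idx[:n_test]
```

For example, predictions that are each off by exactly one unit on measured values 1, 2, 3, 4 give an RMSE of 1 and an R² of 0.2.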

Correlation Analysis
The correlation between tea quality parameters and spectral indices is shown in Figure 7. The ten features having the highest correlation coefficients with the tea quality parameters are shown in Table 3.

Figure 7. Correlation analysis results. TP, TS, TFAA, and C represent tea polyphenols, total sugar, total free amino acids, and caffeine, respectively. The blank sections indicate a failed significance test.

It can be seen from Figure 7 that the correlation analysis results between the spectral parameters and the quality parameters differ considerably among the three sampling methods, but most of the correlations reached a significant level, which indicates that the quality parameters of tea are clearly related to the spectral parameters. This consistent spectral response is the basis for the subsequent steps in this study. It also means that the choice of raster sampling method affects the strength of this relationship and needs to be considered.
Generally, the red (R), red-edge (ED), and near-infrared (NIR) bands are particularly sensitive to the physical and chemical properties of plants. In this study, most spectral features were significantly correlated with the quality parameters, and almost all of the most relevant indices were constructed from R, ED, and NIR band data. This is consistent with previous research results, which observed a strong correlation between crop physical and chemical parameters and spectral characteristics.

Best Fit Sampling Method
The spectral feature parameters and quality parameters obtained by the EO, G, and M sampling methods were used to train models with the PLS, SVR, and RFR algorithms. The average prediction results of the three models for each sampling method are shown in Figure 8, which indicates that the models built on spectral features from the EO sampling method had the highest R² values for the quality parameters and a relatively low overall RMSE. This indicates that the EO sampling method used in this study can effectively reduce the impact of soil and other background noise, improve the data authenticity, and has an evidently positive effect on model prediction accuracy.

Tea Varieties and Canopy Texture Features
Tea samples of more than 10 varieties were used in this study. Among these, Yinghong No. 9, Huangdan, and Hongyan No. 12 were the main varieties. The texture features of these three varieties were extracted using the texture extraction methods described above, and the results are shown in Figure 9. The descriptive statistics of the texture information differed evidently among varieties. In particular, the LBP texture features differed more between varieties than the GLCM features did. This shows that texture features can help distinguish tea varieties, thereby promoting tea quality monitoring.

Best Fit Modeling Algorithm
The EO method was used for sampling, and the values of the tea quality parameters predicted by the three models trained with the PLS, SVR, and RFR algorithms were compared with the measured values. The results are shown in Figure 10. The best-fitting training method differed among the tea quality parameters. For tea polyphenols, PLS gave the best fit and SVR the smallest error. For total sugar, free amino acids, and caffeine, RFR gave both the best fit and the smallest error. Thus, RFR was the best-fitting modeling algorithm for three of the four quality parameters. Taking the goodness of fit and error for tea polyphenols into account as well, RFR was the best modeling algorithm overall.
Figure 10. Results of different regression methods for tea polyphenols, total sugar, free amino acids, and caffeine.

Effect of Texture Features on Model Accuracy
The EO and M sampling methods can destroy the texture information of the original image, and the standard whiteboard in the center of the image must also be excluded. Therefore, we used the G sampling method on the upper-right 1/9 of the original image. With this method, the selected region should contain no background objects, such as soil. RFR, the best-performing algorithm, was used for modeling; its accuracies before and after adding the texture features are shown in Figure 11. For tea polyphenols, adding LBP and GLCM texture features reduced the prediction errors but did not improve the goodness of fit. For total sugar, adding GLCM texture features improved the goodness of fit but increased the errors, adding LBP texture features reduced the errors without improving the goodness of fit, and adding both GLCM and LBP texture features improved both the goodness of fit and the errors. For free amino acids, adding texture features improved both the goodness of fit and the accuracy of the prediction results. For caffeine, the goodness of fit did not improve significantly after texture features were added, but GLCM texture features reduced the prediction errors. Overall, the tea quality parameter monitoring models that integrated texture features showed higher prediction accuracy, with GLCM contributing more to the improvement than LBP.
Figure 11. Effect of texture features on regression results for tea polyphenols, total sugar, free amino acids, and caffeine. Spectrum denotes regression using spectral features only; GLCM denotes the addition of GLCM texture features to the spectral features; LBP denotes the addition of LBP texture features to the spectral features; and the combined case denotes the addition of both GLCM and LBP texture features to the spectral features.

Ground Multispectral Images
RedEdge is a frame multispectral sensor with five discrete narrow bands that is commonly used in remote-sensing studies and precision agriculture [74–78]. It is considered stable and reliable and, as an improved product of hyperspectral technology, its cost has fallen drastically [25,26], thus broadening its applicability in tea quality monitoring. Previous studies have typically mounted the RedEdge-mx on unmanned aerial vehicles [74,75,77,78], which entails more complex data acquisition, lower spatial resolution, and more data processing steps. Close-range applications remain limited [76]; however, the portable handheld ground-based method proposed in this study can obtain reliable multispectral data more easily and accurately than acquisition with unmanned aerial vehicles.
However, a small proportion of ground objects other than tea in the ground multispectral images can affect the final quality parameter monitoring results. To reduce the influence of soil and shadow noise and improve the accuracy of these results, this study used the EXG index to enhance the images by separating green vegetation from the soil background [31,79–82]. The Otsu method was then used for image segmentation [32,83–86] so that the tea areas could be extracted effectively from the original image, which contains other features. The models using the EO sampling method predicted more accurately than those using the G and M sampling methods. This is because background factors, such as soil, act as noise and interfere considerably with the sampling results of the G method, whereas the M sampling method depends entirely on subjective human judgment and loses the objective representativeness of the sample. Conversely, the EO sampling method ensures objectivity while reducing the impact of noise. Notably, the noise elimination used in this study is only applicable to green tea varieties. For images of tea varieties of other colors, the noise and the tea area would still be mixed. Finding a vegetation index that can enhance the characteristics of these nongreen tea varieties should help eliminate noise.
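The EO sampling idea described above (index enhancement followed by binary segmentation) can be illustrated with a minimal, dependency-free sketch. It assumes the chromatic-coordinate form of the excess green index, ExG = 2g - r - b, and a histogram-based Otsu threshold with 64 bins; the function names `exg`, `otsu_threshold`, and `tea_mask` and the bin count are illustrative assumptions, not the implementation used in this study.

```python
def exg(r, g, b):
    """Excess green index on chromatic coordinates: ExG = 2g - r - b."""
    total = r + g + b
    if total == 0:
        return 0.0  # fully dark pixel: no chromatic information
    return (2 * g - r - b) / total


def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # constant input: nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]            # pixels at or below bin i
        sum0 += i * hist[i]
        w1 = total - w0          # pixels above bin i
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, i
    return lo + (best_bin + 1) * width


def tea_mask(pixels):
    """pixels: rows of (r, g, b) tuples. Returns rows of booleans (True = vegetation)."""
    index = [[exg(*p) for p in row] for row in pixels]
    threshold = otsu_threshold([v for row in index for v in row])
    return [[v > threshold for v in row] for row in index]
```

For example, `tea_mask([[(30, 120, 30), (90, 70, 50)]])` marks the green pixel as vegetation and the brownish pixel as background; spectral samples would then be averaged over the `True` pixels only.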

Vegetation Characteristics
Relative to the original spectral information, vegetation indices constructed by combining multispectral bands can highlight specific vegetation characteristics and are widely used to monitor plant physiological and biochemical parameters, such as biomass, total nitrogen content, and chlorophyll content [87–89]. Additionally, some vegetation indices can suppress the influence of soil noise [90,91]. Therefore, in this study, vegetation indices were calculated to strengthen the relationship between the spectral characteristics and quality parameters and to reduce the influence of soil and shadow noise. The correlation analysis shows that the vegetation indices most strongly correlated with tea polyphenols, total sugars, amino acids, and caffeine were GNDVI (r = 0.544), BI (r = 0.812), GNDVI (r = 0.52), and WDRVI (r = 0.598), respectively, thus confirming the necessity of vegetation index calculation. In addition, the vegetation indices most highly correlated with the tea quality parameters (r ≥ 0.462) were related to the R, ED, and NIR bands. In fact, ED has been most widely used as a spectral feature for evaluating crop parameters [45,92–97], and NIR is also a key component of most vegetation indices [98,99]. This indicates that the correlation analysis results in this study are consistent with those of previous studies. However, the vegetation indices calculated in this study have certain limitations. Developing an index that correlates more strongly with tea quality parameters could improve the accuracy of tea quality monitoring results.
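As a minimal sketch of the index calculation and correlation analysis discussed above, the following code computes two of the named indices from band reflectances and a Pearson correlation coefficient. It assumes the commonly used definitions GNDVI = (NIR - G)/(NIR + G) and WDRVI = (a·NIR - R)/(a·NIR + R); the weighting coefficient `alpha = 0.1` is an assumed value, since the coefficient used in this study is not stated in this section, and the function names are illustrative.

```python
import math


def gndvi(nir, green):
    """Green normalized difference vegetation index."""
    return (nir - green) / (nir + green)


def wdrvi(nir, red, alpha=0.1):
    """Wide dynamic range vegetation index; alpha is an assumed weighting coefficient."""
    return (alpha * nir - red) / (alpha * nir + red)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

In use, each sample's index value would be paired with its laboratory-measured quality parameter, for example `pearson_r(gndvi_values, polyphenol_values)`, where both lists are hypothetical per-sample inputs.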
Spectral images combine image and spectral information: they provide not only the spectral information of the target but also its spatial information [100–102]. In remote sensing image classification, the spatial location, shape, and texture characteristics of ground objects are particularly important [103–105]. Previous studies have used remote sensing image texture information for machine classification and achieved good results [55,59,106–108], but few studies have used texture information for regression. Samples of different tea varieties were used in this study. Based on the relationship between genotype and phenotype, differences in texture features can help distinguish different varieties, thereby improving the estimation accuracy of quality parameters. Therefore, in this study, GLCM and LBP texture features were extracted from the multispectral images. As a classic texture feature extraction method, GLCM is widely used in machine vision, and its performance has been recognized by scholars [109–113]. In this study, 16 gray levels and a step size of 1 were set when extracting GLCM texture features. Although this setting is computationally expensive, it preserves texture detail. The default direction was used because the leaves in the tea canopy are randomly oriented, so the difference caused by the direction parameter settings of different sliding windows is very small. The LBP texture feature extraction method, which has been widely applied in facial recognition [114,115] and is increasingly applied in agriculture [116], was also used. The GLCM and LBP texture features were then combined with spectral features to estimate the tea quality parameters.
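The GLCM settings described above (16 gray levels, step size of 1) can be sketched as follows. This is a hedged illustration only: it assumes 8-bit input values, takes the horizontal direction as the "default direction", builds a symmetric normalized matrix, and computes four common statistics (contrast, homogeneity, angular second moment, entropy) rather than the 15 texture features used in this study. All function names are illustrative.

```python
import math


def quantize(image, levels=16, max_value=255):
    """Map pixel values in [0, max_value] to integer gray levels in [0, levels - 1]."""
    return [[min(int(v * levels / (max_value + 1)), levels - 1) for v in row]
            for row in image]


def glcm(image, levels=16, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                i, j = image[y][x], image[y2][x2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image is too small for the requested offset")
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Contrast, homogeneity, angular second moment, and entropy of a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }
```

A single band of a canopy image would be passed through `quantize`, then `glcm`, then `glcm_features`, and the resulting values appended to that sample's spectral features.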
The subsequent results confirmed that the texture information can improve the accuracy of estimating the tea quality parameters, but its effect was not significant. This could be attributed to several reasons. First, to preserve the original image texture information, the upper right corner of the original image was used to extract the texture features, which increased the influence of noise, such as soil and shadows, on the texture features in some images. Second, the 15 texture features selected in this study included GLCM and LBP features, but the texture features that are actually closely related to the tea quality parameters may not have been among them. Finally, the texture features and spectral features were input to the model simultaneously for training, and other feature integration methods, such as constructing hierarchical models, were not applied.
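For reference, the LBP descriptor discussed in this section can be sketched in its basic 3 × 3 form: each interior pixel is compared with its eight neighbors, and the comparison results are packed into an 8-bit code whose normalized histogram serves as the texture feature. The neighbor ordering, the `>=` comparison rule, and the function names are assumptions for illustration; the LBP variant and parameters used in this study are not specified here.

```python
def lbp_codes(image):
    """Basic 8-neighbor LBP codes for the interior pixels of a 2D list."""
    # Clockwise neighbor offsets (dy, dx), starting at the top-left neighbor.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for y in range(1, rows - 1):
        row_codes = []
        for x in range(1, cols - 1):
            center = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if image[y + dy][x + dx] >= center:
                    code |= 1 << bit  # set the bit when the neighbor is not darker
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(codes):
    """Normalized 256-bin histogram of LBP codes."""
    hist = [0.0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    n = sum(hist)
    return [h / n for h in hist] if n else hist
```

Because soil and shadow pixels also produce LBP codes, applying such a descriptor to an unmasked image region would mix background patterns into the histogram, which is consistent with the noise effect described above.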

Modeling Methods
The PLS, SVR, and RFR algorithms are regression modeling methods that have been shown to be concise, stable, and effective in recent studies [67–70]. Among them, PLS is considered the simplest and requires few parameters in the training process. Moreover, it can eliminate multicollinearity among the independent variables x and reduce the data dimension, and the resulting descriptive variables are well suited to predicting the dependent variable y [117–120]. For a fair comparison, the SVR model was trained with the radial basis function kernel and otherwise default parameters, and the RFR model was trained with 500 decision trees and otherwise default parameters. With the same data in MATLAB, the PLS, SVR, and RFR algorithms took 6.063, 23.617, and 3.032 s to train, respectively. In this study, RFR was the best modeling algorithm and also had the highest operating efficiency. These characteristics facilitate the development and promotion of corresponding technologies and equipment. The analysis did not indicate a considerable difference in the goodness of fit among the three models for tea polyphenols, total sugars, free amino acids, and caffeine. The R² values of all the quality parameter monitoring models were between 0.33 and 0.85. The SVR model of tea polyphenols using the EO sampling method had R² > 0.4, whereas the RF model incorporated the texture features of LBP. According to the characteristics of SVR [121], the former is because the SVR model of tea polyphenols is underfitting, whereas the latter is because soil and shadow noise interfere when the monitoring model is fused with LBP texture features; thus, the application of LBP texture features yields negligible improvement in the accuracy of tea polyphenol monitoring. In general, the R² values achieved are similar to the results of previous research [122].
In this study, the RMSE differed considerably between the models, with PLS yielding a higher RMSE than the other two methods. This is because PLS is a linear regression method, and most data in practical situations do not show a simple linear relationship [65]. Therefore, the prediction error of the linear model is higher than that of the nonlinear machine learning models.
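The two evaluation measures compared in this section can be written out explicitly. The sketch below assumes the standard definitions, R² = 1 - SS_res/SS_tot and RMSE = sqrt(mean squared error); the function names are illustrative and the example values in the usage note are made up for demonstration, not taken from this study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
```

For instance, with measured values `[1, 2, 3, 4]` and predictions `[1.1, 1.9, 3.2, 3.8]`, `r_squared` gives about 0.98 and `rmse` about 0.158. The two measures capture different things: R² is relative to the variance of the measured values, whereas RMSE is in the units of the quality parameter, which is why a model can differ little in goodness of fit yet noticeably in error.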

Conclusions
This study investigated the application of low-cost, high-efficiency, high-precision, and easily applicable tea quality monitoring methods. R, ED, and NIR were the bands sensitive to tea quality parameters and are also sensitive bands for most other plants. The EO sampling method, based on feature enhancement with the EXG index and binary segmentation with the Otsu method, helped acquire more accurate and representative spectral sampling results. Compared with the G and M sampling methods, the EO method avoids the influence of soil, shadow, and human subjectivity, which improved the prediction accuracy of the tea quality monitoring model. Furthermore, the GLCM and LBP texture features of the tea canopy images differed among tea varieties. To a certain extent, they improved the prediction accuracy of the tea quality monitoring model, with the GLCM texture features contributing more to model accuracy than the LBP texture features. Among the four tea quality parameters (tea polyphenols, total sugars, free amino acids, and caffeine), total sugar was monitored best (R² = 0.85 and RMSE = 0.16). Among the three modeling methods (PLS, SVR, and RFR), the RFR method showed the highest prediction accuracy. The proposed method can assist in developing universally acceptable portable tea quality monitoring equipment suitable for monitoring multiple tea varieties and can improve tea monitoring efficiency and accuracy.
Author Contributions: Conceptualization and methodology, B.X.; data analysis and writing-original draft preparation, L.C.; writing-review and editing, D.D. and C.Z.; data curation, Q.C. and F.W. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author, upon reasonable request.